Subspace-constrained partial update method for high-dimensional adaptive processing systems

ABSTRACT

A method is explained for any adaptive processor processing digital signals by adjusting signal weights on digital signal(s) it handles, to optimize adaptation criteria responsive to a functional purpose or externalities (transient, temporary, situational, and even permanent) of that processor. Adaptation criteria for the adaptive algorithm may be any combination of a signal or parameter estimation, and measured quality(ies). 
This method performs a linear transformation adapting parameters from M to (M₁+L) dimensions in each adaptation event, such that M₁ weights are updated without constraints and M₀=M−M₁ weights are forced by soft constraints into an L-dimensional subspace they spanned at the beginning of the adaptation period. The same dimensionality reduction, using the same linear transformation, is applied to the input data. The reduced-dimensionality weights are then adapted using the identical optimization strategy employed by the processor, except with input data that has also been reduced in dimensionality.

CROSS REFERENCE TO RELATED APPLICATIONS

This application for patent claims priority under 35 U.S.C. 119(e), in particular §119(e)(3), from the U.S. provisional application for the invention described therein, which was filed by the same inventor on Nov. 3, 2013 by Express Mail Certificate, Post Office to U.S. Patent and Trademark Office, EM Certificate # EQ 338677837 US. This application and Specification expressly reference that original provisional application and incorporate all of that original provisional application's specification and drawings.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

Not Applicable

FIELD OF THE INVENTION

The present invention relates generally to digital signal processing, with particular emphasis on devices employing one or more adaptive processors which employ large numbers of adaptation weights (also known as high-dimensionally, or highly-, adaptive signal processors). Although referred to generally as “an adaptive signal processor” or as “an adaptive digital signal processor” (‘digital’ being generally understood to refer to the nature of the processing used by the computational aspect, while the signals are generally understood to be analog electromagnetic waveforms), this phrase covers any singleton or combination device (i.e., whether the ‘processor’ comprises a single element, or a non-zero set of interacting elements), and whether the device's digital processing aspect is entirely embodied in physical hardware, or in a combined form of hardware, special-purpose firmware, and general-purpose processing software. The term ‘adaptive’ refers to processing that adjusts signal weights applied to the physical signal(s) transmitted, received, or both, by or through said adaptive processor, in order to optimize an adaptation criterion responsive to a functional purpose or the externalities (transient, temporary, situational, and even permanent) for that processor. Each adaptation criterion for the adaptive algorithm may be any of a signal or parameter estimation, measured quality, or any combination thereof.

BACKGROUND OF THE INVENTION

Highly dimensional adaptive processors (devices employing adaptive processors with large numbers of adaptation weights or parameters) are of interest for a wide variety of applications. These applications include:

-   Acoustic echo cancellers, where adaptive noise cancellers employing finite impulse response (FIR) filters with as many as 2,000 adaptively adjusted filter taps are used to remove echoes induced in long-haul telephony networks.
-   Phased array and MIMO radar systems, where large arrays of antennas (10-1,000 elements/array) are used to electronically steer beams at detected targets and nulls at jammers and clutter sources, by combining signals received by the array and distributing signals transmitted to the array using large linear matrix operations.
-   Digital predistortion (DPD) processors, where nonlinear adaptive processors with large numbers of parameters (e.g., Volterra-series approximations of nonlinear processes) are used to adaptively learn, and digitally invert, nonlinear effects added by high-power amplifiers.
-   Smart Grid networks employing spread spectrum modulation formats with large spreading factors and adaptive despreading methods to separate large numbers of co-channel signals, and to detect and remove spoofers from the networks.
-   Massively MIMO cellular networks employing base stations with very large numbers of antenna arrays.

To effect adaptive signal processing in these applications, practical means for adjusting large numbers of weights must be developed and implemented. Techniques that have been developed in the past to accomplish this include nonblind techniques, which use a known reference signal (e.g., a training or pilot signal inserted into a signal transmitted to the adaptive processor); “partially blind” techniques, which employ a known reference with unknown effects added by the communication channel, e.g., delay caused by clock timing offset and physical distance between the transmitter and receiver, and carrier offset caused by LO offset and Doppler shift between the transmitter and receiver; and fully blind methods, which only exploit general structure of the transmitted signal. In many systems, a reference signal can only be made available on a sparse basis, e.g., at the beginning of signal reception, after which the processor must operate using fixed weights without additional training between reference signal reception intervals.

These techniques can also be subdivided into methods with “order-M” (O(M)) or linear complexity, where the number of real multiply-and-accumulate (RMAC) operations per input data sample needed to adapt the processor is on the order of the number of weights M being adjusted by the processor, and methods with higher-order (e.g., O(M^(v)), where v>1) complexity, where the number of RMACs per data sample needed to adapt the processor rises much faster than the number of weights being adjusted by the processor. Typically, the most powerful and effective adaptive processing methods have complexity of high order. This presents significant challenges in applications where the number of adaptation weights M is very large.

Lastly, these techniques can be subdivided into sample-processing methods, where the processor weights are adapted every time a new input data sample is provided to the processor, and block-processing methods, where a block of input data is received and used to adapt the processor. In some cases, the algorithm may circulate through the data block multiple times before moving onto the next processing block. Again, the more powerful and effective adaptive processing methods employ block processing, typically with a blocksize N that is (in many cases, must be) a large multiple of M. However, the cost of this processing is reduced update rate; reduced response time to changes in channel effects affecting the adaptive processor; and (e.g., for multiple passes through the data block) additional increase in complexity.

It should also be noted that the operations referred to above are the “adapt-path” operations used to train the adaptive processor, not the “data-path” operations used to implement the adaptive processor during and after training. Adapt-path operations are used to tune the adaptive processor used to process a set of signals, while data-path operations are used to process a set of signals during and after tuning. For most of the applications described above (the DPD application being a notable exception), the data-path operations have O(M) complexity, regardless of the complexity of the adapt path.

To address the adapt-path complexity issue in particular, the concept of a partial update (PU) method (PUM; in the plural, PUMs) that only updates a subset of M₁ weights during each adaptation block or sample (referred to hereafter as a block with size N=1) has been proposed for a number of applications. All PUMs developed to date can be interpreted as linearly-constrained optimization techniques, in which the original method is adjusted by applying a hard linear constraint that forces M₀=M−M₁ weights to remain at the same value between adaptation blocks or samples. The subset of weights actually adapted during each data block, or during each of several passes through a data block, is changed during each adaptation event, so that every weight is updated over the course of multiple adaptation events.

This approach has substantive limitations in practice. First, the linear constraint, by its nature, can induce severe misadjustment from the optimal solution sought by the processor. This can manifest as a convergent or steady-state bias from the optimal solution, as a “jitter” or fluctuation about that steady-state solution, or as both. In some applications, e.g., phased array radar applications where the received radar waveform must be extracted from strong clutter and jamming, this can cause the system to fail entirely (studies of PUMs showing “convergence-in-mean” to optimal solutions are almost always conducted under assumptions of little-or-no noise and removable multipath distortion). Even if the processor signal of interest is received at high signal-to-interference-and-noise ratio (SINR), this can lead to well-known “hypersensitivity” issues that degrade the system performance from the optimal solution.

Second, the linear optimization constraint can only be easily added to a small subset of O(M²) optimization functions, e.g., “least-squares (LS)” or LS-like methods that can be formulated as a quadratic optimization problem, or O(M) “least-mean-squares (LMS)” or LMS-like methods that are either intended to approximate LS optimization algorithms (e.g., by replacing gradients with “stochastic gradient” approximations), or that can themselves be formulated as linearly constrained quadratic optimization problems (e.g., “normalized LMS (NLMS)” and “Affine Projections” algorithms). In many cases, adherence to the constraint significantly increases complexity of the original method, and approximations, e.g., using Lagrange multipliers in which the multiplier itself is added to the algorithm, only increase the misadjustment of the algorithm.

In summary, the current PUMs developed to date can only be used with a small number of O(M²) methods, and cannot be used with any O(M^(v)) methods where v>2. This is particularly unfortunate, because the PUM should have its strongest utility with these classes of methods. This is especially evident when the complexity of the data-path processing, which as noted above is typically O(M), is added to the adapt-path processing: at best for O(M) adapt-path methods, the PUM will only reduce overall complexity by 50%. This is the background in which the present invention takes form.
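The 50% bound for O(M) adapt-path methods can be checked with a rough operation count. The following is a sketch only, assuming (this is not stated above) that the data path and the full-update adapt path each cost about cM RMACs per sample with the same constant c, and that a partial update lowers the adapt-path cost to about cM₁:

```latex
\frac{C_{\mathrm{PU}}}{C_{\mathrm{full}}}
  \approx \frac{cM + cM_{1}}{cM + cM}
  = \frac{1}{2} + \frac{M_{1}}{2M}
  \;\ge\; \frac{1}{2}
```

Under these assumptions the ratio approaches 1/2 only as M₁/M tends to zero, because the O(M) data-path cost is unaffected by the partial update.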

SUMMARY OF THE INVENTION

The present invention is a method for implementing partial-update methods (PUMs) in any adaptive processor that adjusts weights to optimize an adaptation criterion in a signal estimation or parameter estimation algorithm. In the preferred embodiment, the method does this by performing a linear transformation of the processor parameters being adapted from M dimensions to (M₁+L) dimensions in each adaptation event, where M₁ weights are updated without constraints, and M₀=M−M₁ weights are subjected to L soft constraints that force them into an L-dimensional subspace spanned by those weights (preferentially, those weights are a scaled replica of the original weights) at the beginning of the adaptation event, and where M₁ and L are much smaller than M (M₁<<M and L<<M). Preferentially, L is equal to unity (L=1), i.e., the M₀ constrained weights are forced into a single-dimensional subspace spanned by those weights.

The same dimensionality reduction is also applied to the input data, using the same linear transformation. The reduced-dimensionality weights are then adapted using exactly the same optimization strategy employed by the adaptive processor, except with input data that has also been reduced in dimensionality.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated in the attached drawings as described herein.

FIG. 1 is a view of an optimization approach used in a system employing a prior-art nonblind single-port unconstrained adaptation algorithm.

FIG. 2 is a view of an optimization approach used in a system employing a prior-art nonblind single-port partial-update (PU) adaptation algorithm.

FIG. 3 is a view of an optimization approach used in a system employing a nonblind single-port subspace-constrained partial update (SCPU) adaptation algorithm.

FIG. 4 is a view of a nonblind single-port SCPU adapt-path weight update procedure, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained single-port weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure.

FIG. 5 is a view of a nonblind uncoupled multiport SCPU adapt-path weight update procedure, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of parallel banks of reduced-complexity unconstrained single-port weight adaptation algorithms of arbitrary type and structure after the subspace separation procedure.

FIG. 6 is a view of a nonblind fully-coupled multiport SCPU adapt-path weight update procedure, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained multiport weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure.

DETAILED DESCRIPTION OF THE DRAWINGS

FIG. 1 is a view of an optimization approach used in a system employing a prior-art nonblind single-port unconstrained adaptation algorithm. On a first path, a vector processor [1] provides a sequence of data vectors x(n_(sym))=[x₁(n_(sym)) . . . x_(M)(n_(sym))]^(T), each data vector having dimension M×1, where n_(sym) is a symbol index, and where M is a real positive integer, referred to here as the degrees-of-freedom (DoF's) of the system, and (•)^(T) denotes the matrix transpose operation. As part of a data-path processing procedure, the data vector sequence x(n_(sym)) is then passed through a linear combiner [3] that performs a matrix multiplication of x(n_(sym)) by a weight vector w=[w₁ . . . w_(M)]^(T) having dimension M×1, resulting in an output data scalar y(n_(sym))=x^(T)(n_(sym))w having dimension 1×1.

As part of an adapt-path processing procedure that is a focus of the invention, the data vector sequence x(n_(sym)) is also passed into a bank of M 1:N serial-to-parallel (S/P) convertors [2] that converts the vector data sequence into a sequence of data matrices X(n)=[x(nN+1) . . . x(nN+N)]^(T), each data matrix having dimension N×M, where N is a real-positive integer, referred to here as the block length of the adaptation algorithm, and n is an adapt block index.

On a second path, and also as part of an adapt-path processing procedure, a sequence of reference scalars s(n_(sym)) is provided by a reference generator [4], each reference scalar having dimension 1×1. In the nonblind adaptation algorithm shown in FIG. 1, the reference scalars s(n_(sym)) are known at the receiver, and are correlated with some component of the data vector x(n_(sym)) in some known manner; however, in other system implementations, the reference scalars may be members of a set of possible known received signal components, or may be derived from the output data vector in some manner.

The reference scalars are then passed into a single 1:N serial-to-parallel (S/P) convertor [5] that converts the scalar symbol sequence into a sequence of reference vectors s(n), each vector reference data symbol having dimension N×1. The reference vector s(n) is then compared with the data matrix X(n) (from the bank of M 1:N serial-to-parallel converters [2]) over each adapt block, and used to generate a weight vector w using an unconstrained adaptation algorithm [6] that adjusts every element of w to optimize a metric of similarity between the output data vector y(n)=X(n)w and the reference vector s(n), e.g., the sum-of-squares error metric F(w;n)=∥s(n)−X(n)w∥₂ ², where ∥•∥₂ denotes the L2 vector norm. The weights are then passed to the data-path processor [3], where they are used to process the input data vectors on a symbol-by-symbol basis.
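The serial-to-parallel conversion and the sum-of-squares error metric described above can be illustrated with a short sketch. It is a minimal illustration only, assuming real-valued data held in plain Python lists, an adapt block index n that starts at 0, and hypothetical function names (`serial_to_parallel`, `combine`, `sum_of_squares_error`) that do not appear in the figures:

```python
def serial_to_parallel(x_seq, n, N):
    """Return the N x M data matrix X(n) = [x(nN+1) ... x(nN+N)]^T.

    x_seq is a list of M-element data vectors, with x_seq[0] holding x(1).
    """
    return [list(x_seq[n * N + k]) for k in range(N)]


def combine(X, w):
    """Linear combiner: y = X w, one output sample per row of X."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]


def sum_of_squares_error(s, X, w):
    """Metric F(w;n) = ||s(n) - X(n) w||^2 for one adapt block."""
    y = combine(X, w)
    return sum((si - yi) ** 2 for si, yi in zip(s, y))


# Illustrative data: four symbols, M = 2 degrees of freedom, block length N = 2.
x_seq = [[1, 2], [3, 4], [5, 6], [7, 8]]
X0 = serial_to_parallel(x_seq, n=0, N=2)   # rows x(1), x(2)
X1 = serial_to_parallel(x_seq, n=1, N=2)   # rows x(3), x(4)
err = sum_of_squares_error([0, 0], X0, [1, -1])
```

An unconstrained adaptation algorithm such as [6] would adjust every element of w to reduce this metric; the sketch only evaluates it.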

It should be noted that the data matrices and reference vectors do not need to be contiguous, internally or between adapt blocks on the adapt-paths. However, the input data matrices and reference vectors should have internally consistent symbol indices.

FIG. 2 is a view of an optimization approach used in a system employing a prior-art nonblind single-port partial-update (PU) adaptation algorithm. On a first path, a vector processor [1] provides a sequence of data vectors x(n_(sym))=[x₁(n_(sym)) . . . x_(M)(n_(sym))]^(T), each data vector having dimension M×1, where n_(sym) is a symbol index, and where M is a real positive integer, referred to here as the degrees-of-freedom (DoF's) of the system, and (•)^(T) denotes the matrix transpose operation. As part of a data-path processing procedure, the data vector sequence x(n_(sym)) is then passed through a linear combiner [3] that performs a matrix multiplication of x(n_(sym)) by a weight vector w=[w₁ . . . w_(M)]^(T) having dimension M×1, resulting in an output data scalar y(n_(sym))=x^(T)(n_(sym))w having dimension 1×1.

As part of an adapt-path processing procedure that is a focus of the invention, the data vector sequence x(n_(sym)) is also passed into a bank of M 1:N serial-to-parallel (S/P) convertors [2] that converts the vector data sequence into a sequence of data matrices X(n)=[x(nN+1) . . . x(nN+N)]^(T), each data matrix having dimension N×M, where N is a real-positive integer, referred to here as the block length of the adaptation algorithm, and n is an adapt block index.

On a second path, and also as part of an adapt-path processing procedure, a sequence of reference scalars s(n_(sym)) is provided by a reference generator [4], each reference scalar having dimension 1×1. In the nonblind adaptation algorithm shown in FIG. 2, the reference scalars s(n_(sym)) are known at the receiver, and are correlated with some component of the data vector x(n_(sym)) in some known manner; however, in other system implementations, the reference scalars may be members of a set of possible known received signal components, or may be derived from the output data vector in some manner.

The reference scalars are then passed into a single 1:N serial-to-parallel (S/P) convertor [5] that converts the scalar symbol sequence into a sequence of reference vectors s(n), each vector reference data symbol having dimension N×1.

On a third path, and also as part of an adapt-path processing procedure, an update-set selection algorithm [7] is used to generate a sequence of M₁-element update-sets ℳ₁(n)={m∈{1, . . . , M}: m(1;n), . . . , m(M₁;n)} and complementary M₀-element held-sets ℳ₀(n)={m∈{1, . . . , M}: m∉ℳ₁(n)} over each adapt block, such that M₀=M−M₁, ℳ₀(n)∪ℳ₁(n)={1, . . . , M}, and ℳ₀(n)∩ℳ₁(n)={ } within adapt block n. The set selection strategy can be adjusted using deterministic, random, pseudo-random, or data-derived methods. In the partial-update optimization approach shown in FIG. 2, the update-set and held-set {ℳ₁(n), ℳ₀(n)} are further used to generate update-set and held-set projection matrices [9] {M₁(n),M₀(n)}, where M_(l)(n)=[e_(M)(m_(l))]_(m_(l)∈ℳ_(l)(n)) for l=0,1, and where e_(M)(m_(l))=[δ(m−m_(l))]_(m=1)^(M) is the m_(l)^(th) M×1 Euclidean basis vector and δ(k) is the Kronecker delta function.
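The construction of the update-set, held-set, and projection matrices can be sketched as follows. This is an illustration only: the round-robin selection rule is a hypothetical example of a deterministic strategy (the text above does not prescribe one), indices are 0-based rather than 1-based, and the function names are assumptions introduced here:

```python
def select_update_set(n, M, M1):
    """Hypothetical round-robin rule: adapt block n updates M1 consecutive
    weight indices (modulo M). Returns 0-based (update_set, held_set)."""
    update_set = sorted((n * M1 + k) % M for k in range(M1))
    held_set = [m for m in range(M) if m not in update_set]
    return update_set, held_set


def projection_matrix(index_set, M):
    """M x len(index_set) matrix whose columns are Euclidean basis vectors,
    one column per index in index_set."""
    return [[1 if row == m else 0 for m in index_set] for row in range(M)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


M, M1_size = 5, 2
update_set, held_set = select_update_set(n=2, M=M, M1=M1_size)
P1 = projection_matrix(update_set, M)   # update-set projection matrix
P0 = projection_matrix(held_set, M)     # held-set projection matrix

# Right-multiplying a data matrix by a projection matrix selects columns,
# which is why the operation reduces to simple demultiplexing.
X = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
X_update = matmul(X, P1)
X_held = matmul(X, P0)
```

Because each projection matrix contains a single 1 per column, no multiplications are needed in practice; the `matmul` call is used here only to make the equivalence with column selection explicit.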

The reference vector s(n) is then compared with the data matrix X(n) over each adapt block, and used to generate a weight vector w using a hard-constrained adaptation algorithm [8] that adjusts only the elements of w in the update-set, i.e., (w)_(m∈ℳ₁(n)), to optimize a metric of similarity between the output data vector y(n)=X(n)w and the reference vector s(n), e.g., the sum-of-squares error metric F(w;n)=∥s(n)−X(n)w∥₂², while holding the elements of w in the held-set, i.e., (w)_(m∈ℳ₀(n)), equal to the same values held by those weight elements over the previous adapt block. This can be expressed as optimization of existing weight vector w to form new weight vector w′, subject to hard linear constraint M₀^(T)(n)w′=M₀^(T)(n)w. The weights are then passed to the data-path processor [3], where they are used to process the input data vectors on a symbol-by-symbol basis.
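For the sum-of-squares metric, the effect of the hard constraint can be written out explicitly. The following is a sketch, using the shorthand X₁(n)=X(n)M₁(n), X₀(n)=X(n)M₀(n), w₀=M₀^(T)(n)w and w₁′=M₁^(T)(n)w′ (the symbol w₁′ is introduced here for illustration only), and relying on the fact that the columns of M₁(n) and M₀(n) together form a permutation of the M×M identity matrix:

```latex
\begin{aligned}
w' &= M_{1}(n)\,w_{1}' + M_{0}(n)\,w_{0}, \\
F(w';n) &= \bigl\| \bigl(s(n) - X_{0}(n)\,w_{0}\bigr) - X_{1}(n)\,w_{1}' \bigr\|_{2}^{2}.
\end{aligned}
```

Only the M₁ elements of w₁′ remain free, so the held-set contribution X₀(n)w₀ acts as a fixed offset that the update can neither rescale nor remove.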

It should be noted that the data matrices and reference vectors do not need to be contiguous, internally or between adapt blocks on the adapt-paths. However, the input data matrices and reference vectors should have internally consistent symbol indices.

FIG. 3 is a view of an optimization approach used in a system employing a new nonblind single-port subspace-constrained partial-update (SCPU) adaptation algorithm. On a first path, a vector processor [1] provides a sequence of data vectors x(n_(sym))=[x₁(n_(sym)) . . . x_(M)(n_(sym))]^(T), each data vector having dimension M×1, where n_(sym) is a symbol index, and where M is a real positive integer, referred to here as the degrees-of-freedom (DoF's) of the system, and (•)^(T) denotes the matrix transpose operation. As part of a data-path processing procedure, the data vector sequence x(n_(sym)) is then passed through a linear combiner [3] that performs a matrix multiplication of x(n_(sym)) by a weight vector w=[w₁ . . . w_(M)]^(T) having dimension M×1, resulting in an output data scalar y(n_(sym))=x^(T)(n_(sym))w having dimension 1×1.

As part of an adapt-path processing procedure that is a focus of the invention, the data vector sequence x(n_(sym)) is also passed into a bank of M 1:N serial-to-parallel (S/P) convertors [2] that converts the vector data sequence into a sequence of data matrices X(n)=[x(nN+1) . . . x(nN+N)]^(T), each data matrix having dimension N×M, where N is a real-positive integer, referred to here as the block length of the adaptation algorithm, and n is an adapt block index.

On a second path, a sequence of reference scalars s(n_(sym)) is provided by a reference generator [4], each reference scalar having dimension 1×1. In the nonblind adaptation algorithm shown in FIG. 3, the reference scalars s(n_(sym)) are known at the receiver, and are correlated with some component of the data vector x(n_(sym)) in some known manner; however, in other system implementations, the reference scalars may be members of a set of possible known received signal components, or may be derived from the output data vector in some manner.

The reference scalars are then passed into a single 1:N serial-to-parallel (S/P) convertor [5] that converts the scalar symbol sequence into a sequence of reference vectors s(n), each vector reference data symbol having dimension N×1.

On a third path, an update-set selection algorithm [7] is used to generate a sequence of M₁-element update-sets ℳ₁(n)={m∈{1, . . . , M}: m(1;n), . . . , m(M₁;n)} and complementary M₀-element held-sets ℳ₀(n)={m∈{1, . . . , M}: m∉ℳ₁(n)} over each adapt block, such that M₀=M−M₁, ℳ₀(n)∪ℳ₁(n)={1, . . . , M}, and ℳ₀(n)∩ℳ₁(n)={ } within adapt block n. The set selection strategy can be adjusted using deterministic, random, pseudo-random, or data-derived methods. In the partial-update optimization approach shown in FIG. 3, the update-set and held-set {ℳ₁(n), ℳ₀(n)} are further used to generate update-set and held-set projection matrices [9] {M₁(n),M₀(n)}, where M_(l)(n)=[e_(M)(m_(l))]_(m_(l)∈ℳ_(l)(n)) for l=0,1, and where e_(M)(m_(l))=[δ(m−m_(l))]_(m=1)^(M) is the m_(l)^(th) M×1 Euclidean basis vector and δ(k) is the Kronecker delta function.

The reference vector s(n) is then compared with the data matrix X(n) over each adapt block, and used to generate a weight vector w using a subspace-constrained adaptation algorithm [10] that adjusts the elements of w in the update-set, i.e., (w)_(m∈ℳ₁(n)), to optimize a metric of similarity between the output data vector y(n)=X(n)w and the reference vector s(n), e.g., the sum-of-squares error metric F(w;n)=∥s(n)−X(n)w∥₂², while optimizing the elements of w in the held-set, i.e., (w)_(m∈ℳ₀(n)), to a scalar multiple of the values held by those weight elements over the previous adapt block. This can be expressed as optimization of existing weight vector w to form new weight vector w′, subject to subspace constraint M₀^(T)(n)w′=g₀M₀^(T)(n)w, where g₀ is an unknown scalar that is also optimized by the algorithm. The weights are then passed to the data-path processor [3], where they are used to process the input data vectors on a symbol-by-symbol basis.
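The subspace constraint turns the M-dimensional problem into an unconstrained problem in M₁+1 unknowns. The following worked step is a sketch under the shorthand X₁(n)=X(n)M₁(n), X₀(n)=X(n)M₀(n) and w₁=M₁^(T)(n)w′, and it relies on the columns of M₁(n) and M₀(n) together forming a permutation of the M×M identity matrix:

```latex
\begin{aligned}
w' &= M_{1}(n)\,w_{1} + g_{0}\,M_{0}(n)\,w_{0},
  \qquad w_{0} = M_{0}^{T}(n)\,w, \\
X(n)\,w' &= X_{1}(n)\,w_{1} + g_{0}\,X_{0}(n)\,w_{0}
  = \begin{pmatrix} X_{1}(n) & y_{0}(n) \end{pmatrix}
    \begin{pmatrix} w_{1} \\ g_{0} \end{pmatrix},
  \qquad y_{0}(n) = X_{0}(n)\,w_{0}.
\end{aligned}
```

Any metric that depends on w′ only through X(n)w′ can therefore be optimized over w₁ and g₀ without constraints, and fixing g₀=1 recovers the hard-constrained partial update of FIG. 2 as a special case.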

It should be noted that the data matrices and reference vectors do not need to be contiguous, internally or between adapt blocks on the adapt-paths. However, the input data matrices and reference vectors should have internally consistent symbol indices.

FIG. 4 is a view of a nonblind, single-port, SCPU adapt-path weight update method, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained single-port weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure. Over adapt block n, the N×M data matrix X(n) provided by the 1:N S/P bank [2] (not shown) is separated into an N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n) and an N×M₀ dimensional held-set data matrix X₀(n)=X(n)M₀(n), using a columnar matrix demultiplexer (DMX) [11] and the update-set and held-set projection matrices provided over adapt block n [7].

An M×M₀ dimensional held-set projection matrix M₀(n) is additionally used to extract the M₀×1 dimensional held-set combiner weights w₀=M₀^(T)(n)w from the M×1 dimensional combiner weights w stored in current memory [12], e.g., computed in prior adapt blocks, using a held-set weight extractor [13]. These held-set combiner weights w₀ are used to multiply the held-set data matrix X₀(n) from the columnar matrix demultiplexer (DMX) [11] through a linear combiner [14], yielding an N×1 held-set output data vector y₀(n)=X₀(n)w₀. X₁(n) and y₀(n) are then combined into an N×(M₁+1) enhanced data matrix X̃(n)=[X₁(n) y₀(n)] using a column-wise multiplexing (MUX) operation [15].

The enhanced data matrix X̃(n) is then input to an unconstrained weight adaptation algorithm [16] that adjusts every element of an (M₁+1)×1 enhanced combiner vector

$\tilde{w} = \begin{pmatrix} w_{1} \\ g_{0} \end{pmatrix}$

to optimize a metric of similarity between an N×1 reference vector s(n) provided by a reference generator [4] (not shown) and an N×1 output data vector y(n)=X̃(n)w̃ that would be provided by an (M₁+1)-element linear combining operation (not shown). The unconstrained weight adaptation algorithm [16] optimizes the same metric as the unconstrained weight adaptation algorithm [6] depicted in prior art FIG. 1, e.g., the sum-of-squares error metric F(w̃;n)=∥s(n)−X̃(n)w̃∥₂². However, the complexity of the unconstrained weight adaptation algorithm is O((M₁+1)^(v)) in [16], rather than O(M^(v)) in [6], where v is the complexity order of the algorithm, e.g., v=2 if a sum-of-squares metric is used in both Figures.

The updated (M₁+1)×1 enhanced combiner vector w̃ is then demultiplexed (DMX'd) [17] into an updated M₁×1 update-set weight vector w₁ comprising the first M₁ elements of w̃, and a new held-set scalar multiplier g₀ comprising the last element of w̃. The held-set scalar multiplier g₀ is then multiplied by the current M₀×1 held-set weights w₀ [18] to form updated held-set weights w₀←w₀g₀, and multiplexed (MUX'd) [19] with the updated M₁×1 update-set weight vector w₁, in accordance with the current update-set selection algorithm [7], to form updated M×1 dimensional weight vector w=M₁(n)w₁+M₀(n)w₀. This weight vector is then stored in memory [12], allowing its use as an initial combiner weight vector in a subsequent adapt block. The weight vector can also be used in the data-path linear combiner (not shown) for parallel or subsequent data-path processing operations used in the overall system.
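The single-port procedure of FIG. 4 can be sketched end to end for one adapt block. This is a minimal illustration, not a definitive implementation: it assumes real-valued data, 0-based weight indices, and a plain least-squares solver standing in for the unconstrained weight adaptation algorithm [16], which may be of any type. The solver assumes the enhanced data matrix has full column rank; the function names are hypothetical.

```python
def solve_least_squares(A, b):
    """Minimise ||b - A w||^2 via the normal equations (A^T A) w = A^T b,
    solved by Gauss-Jordan elimination with partial pivoting.
    Assumes A has full column rank."""
    n_rows, n_cols = len(A), len(A[0])
    # Augmented normal-equation system [A^T A | A^T b].
    G = [[sum(A[k][i] * A[k][j] for k in range(n_rows)) for j in range(n_cols)]
         + [sum(A[k][i] * b[k] for k in range(n_rows))]
         for i in range(n_cols)]
    for col in range(n_cols):
        pivot = max(range(col, n_cols), key=lambda r: abs(G[r][col]))
        G[col], G[pivot] = G[pivot], G[col]
        for r in range(n_cols):
            if r != col:
                f = G[r][col] / G[col][col]
                G[r] = [a - f * c for a, c in zip(G[r], G[col])]
    return [G[i][n_cols] / G[i][i] for i in range(n_cols)]


def scpu_update(X, s, w, update_set):
    """One SCPU adapt block; returns (updated weights, g0)."""
    M = len(w)
    held_set = [m for m in range(M) if m not in update_set]
    # Held-set output y0 = X0 w0, with X0 the held-set columns of X.
    y0 = [sum(row[m] * w[m] for m in held_set) for row in X]
    # Enhanced data matrix [X1 y0]: update-set columns plus one extra column.
    X_enh = [[row[m] for m in update_set] + [y0k] for row, y0k in zip(X, y0)]
    w_enh = solve_least_squares(X_enh, s)      # (M1 + 1) unknowns only
    w1, g0 = w_enh[:-1], w_enh[-1]
    # Scale every weight by g0, then overwrite the update-set entries with w1,
    # which gives w = M1 w1 + M0 (g0 w0).
    w_new = [g0 * wm for wm in w]
    for m, w1m in zip(update_set, w1):
        w_new[m] = w1m
    return w_new, g0


# Illustrative data: the held-set weights start at half their ideal values.
X = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1], [1, 0, 0, 1]]
w_true = [1.0, 2.0, 3.0, 4.0]
s = [sum(x * wt for x, wt in zip(row, w_true)) for row in X]
w_new, g0 = scpu_update(X, s, w=[0.0, 0.0, 1.5, 2.0], update_set=[0, 1])
```

In this example a hard-constrained partial update could not correct the held-set weights during the block, whereas the extra scalar g₀ lets the sketch rescale them while solving for only M₁+1=3 unknowns instead of M=4.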

FIG. 5 is a view of a nonblind, multiport, uncoupled SCPU adapt-path weight update method, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of parallel banks of reduced-complexity and unconstrained, single-port, weight adaptation algorithms of arbitrary type and structure after the subspace separation step. Over adapt block n, the N×M data matrix X(n) provided by the 1:N S/P bank [2] (not shown) is separated into an N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n) and an N×M₀ dimensional held-set data matrix X₀(n)=X(n)M₀(n), using a columnar matrix demultiplexer (DMX) [11] and the update-set and held-set projection matrices provided over adapt block n from the update-set selection algorithm [7]. On each output port p, where p=1, . . . , P, and where P is the number of weight ports adapted by the overall algorithm, the M×M₀ dimensional held-set projection matrix M₀(n) is additionally used to extract the M₀×1 dimensional port p held-set combiner weights w₀(p)=M₀^(T)(n)w(p) from the M×1 dimensional port p combiner weight vector w(p) stored in current memory [20], e.g., computed over a prior adapt block, using a held-set weight extractor [13]. The held-set data matrix X₀(n) is then multiplied by the port p held-set combiner weights w₀(p) from the held-set weight extractor [13] through a linear combiner [14], yielding an N×1 port p held-set output data vector y₀(n;p)=X₀(n)w₀(p) for each port p. X₁(n) and y₀(n;p) are then combined into an N×(M₁+1) enhanced port p data matrix X̃(n;p)=[X₁(n) y₀(n;p)] using a column-wise multiplexing (MUX) operation [15]. The port p enhanced data matrix is then input to an unconstrained weight adaptation algorithm [21] that adjusts every element of an (M₁+1)×1 port p enhanced combiner vector

$\tilde{w}(p) = \begin{pmatrix} w_{1}(p) \\ g_{0}(p) \end{pmatrix}$

to optimize a metric of similarity between an N×1 port p reference vector s(n;p) provided by a reference generator [4] (not shown) and an N×1 output data vector y(n;p)=X̃(n;p)w̃(p) that would be provided by an (M₁+1)-element linear combining operation (not shown). The unconstrained weight adaptation algorithm [21] optimizes the same metric as the unconstrained weight adaptation algorithm [6] depicted in prior art FIG. 1, e.g., the sum-of-squares error metric F(w̃(p);n,p)=∥s(n;p)−X̃(n;p)w̃(p)∥₂². However, the complexity of the unconstrained weight adaptation algorithm is O((M₁+1)^(v)) in [21], rather than O(M^(v)) in [6], where v is the complexity order of the algorithm, e.g., v=2 if a sum-of-squares metric is used in both Figures. Additionally, the weight adaptation algorithm can exploit commonality between the enhanced data matrices {X̃(n;p)}_(p=1)^(P), i.e., the common update-set data matrix X₁(n) contained within each enhanced data matrix, to share results of operations on X₁(n) performed for each algorithm port, resulting in an additional reduction in computational complexity for the overall multiport processor.

The updated (M₁+1)×1 port p enhanced combiner vector {tilde over (w)}(p) is then demultiplexed (DMX'd) [17] into an updated M₁×1 port p update-set weight vector w₁(p) comprising the first M₁ elements of {tilde over (w)}(p), and a new port p held-set scalar multiplier g₀(p) comprising the last element of {tilde over (w)}(p). The held-set scalar multiplier g₀(p) is then multiplied by the current (from the held-set weight extractor [13]) M₀×1 port p held-set weights w₀(p) [18] to form updated port p held-set weights w₀(p)←w₀(p)g₀(p), and multiplexed (MUX'd) [19] with the updated M₁×1 port p update-set weight vector w₁(p), in accordance with the current update-set selection algorithm [7], to form the updated M×1 dimensional port p weight vector w(p)=M₁(n)w₁(p)+M₀(n)w₀(p). This weight vector is then stored in memory [20], allowing its use as an initial combiner weight vector in a subsequent adapt block. The weight vector can also be used in a port p data-path linear combiner (not shown) for parallel or subsequent data-path processing operations used in the overall system.
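By way of non-limiting illustration only, the equivalence between the sparse projection matrices and simple multiplexing and demultiplexing operations can be sketched in Python with NumPy. The function names `projection_matrices`, `demux`, and `mux`, and the use of zero-based index arrays for the update set, are assumptions introduced for this sketch and do not appear in the figures.

```python
import numpy as np

def projection_matrices(M, update_idx):
    """Build explicit M x M1 and M x M0 projection matrices from an update set.

    Indices are zero-based here (an assumption of this sketch).
    """
    update_idx = np.asarray(update_idx)
    held_idx = np.setdiff1d(np.arange(M), update_idx)  # complementary held set
    eye = np.eye(M)
    return eye[:, update_idx], eye[:, held_idx], held_idx

def demux(X, w, update_idx, held_idx):
    """Multiply-free DMX: select columns of X and elements of w by index."""
    return X[:, update_idx], X[:, held_idx], w[update_idx], w[held_idx]

def mux(M, update_idx, held_idx, w1, w0):
    """Multiply-free MUX: scatter update-set and held-set weights back into w."""
    w = np.empty(M, dtype=np.result_type(w1, w0))
    w[update_idx] = w1
    w[held_idx] = w0
    return w
```

In this sketch, `X @ M1` and `X[:, update_idx]` give the same update-set data matrix, so the explicit matrix products need not be formed in practice.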

FIG. 6 is a view of a nonblind, multiport, fully-coupled SCPU adapt-path weight update method, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained multiport weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure. Over adapt block n, the N×M data matrix X(n) provided by the 1:N S/P bank [2] (not shown) is separated into an N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n) and an N×M₀ dimensional held-set data matrix X₀(n)=X(n)M₀(n), using a columnar matrix demultiplexer (DMX) [11] and the update-set and held-set projection matrices provided over adapt block n from the update-set selection algorithm [7]. The M×M₀ dimensional held-set projection matrix M₀(n) is additionally used to extract the M₀×P dimensional held-set combiner weights W₀=M₀ ^(T)(n)W from the M×P dimensional combiner weight matrix W stored in current memory [20], e.g., computed over a prior adapt block, using a held-set multiport weight extractor [22]. The held-set data matrix X₀(n) is then multiplied by the M₀×P held-set multiport combiner weights W₀ through a linear combiner [23], yielding the N×P multiport held-set output data matrix Y₀(n)=X₀(n)W₀. X₁(n) and Y₀(n) are then combined into an N×(M₁+P) dimensional enhanced data matrix {tilde over (X)}(n)=[X₁(n) Y₀(n)] using a column-wise multiplexing (MUX) operation [24].

The enhanced data matrix is then input to an unconstrained multiport weight adaptation algorithm [25] that adjusts every element of an (M₁+P)×P enhanced multiport combiner matrix {tilde over (W)}=

$\quad\begin{pmatrix} W_{1} \\ G_{0} \end{pmatrix}$

to optimize a metric of similarity between an N×P reference matrix S(n) provided by a multiport reference generator (not shown) and an N×P output data matrix Y(n)={tilde over (X)}(n){tilde over (W)} that would be provided by an (M₁+P)×P element linear combining operation (not shown), e.g., the sum-of-squares error metric F({tilde over (W)}; n)=∥S(n)−{tilde over (X)}(n){tilde over (W)}∥_(F) ², where ∥•∥_(F) denotes the Frobenius matrix norm. The complexity of the unconstrained weight adaptation algorithm is O(P(M₁+1)^(v)) in [25], where v is the complexity order of the algorithm, e.g., v=2 if a sum-of-squares metric is used to optimize {tilde over (W)}.

The updated (M₁+P)×P dimensional enhanced combiner matrix {tilde over (W)} is then demultiplexed (DMX'd) [26] into an updated M₁×P dimensional update-set weight matrix W₁ comprising the first M₁ rows of {tilde over (W)}, and a new P×P dimensional held-set multiplier matrix G₀ comprising the last P rows of {tilde over (W)}. The held-set multiplier matrix G₀ is then multiplied by the current M₀×P held-set weights W₀ [27] to form updated held-set weights W₀←W₀G₀, and multiplexed (MUX'd) [28] with the updated M₁×P update-set weight matrix W₁, in accordance with the current update-set selection algorithm [7], to form the updated M×P dimensional weight matrix W=M₁(n)W₁+M₀(n)W₀. This weight matrix is then stored in memory [20], allowing its use as an initial combiner weight matrix in a subsequent adapt block. The weight matrix can also be used in a multiport data-path linear combiner (not shown) for parallel or subsequent data-path processing operations used in the overall system.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

A method is explained for processing digital signals by any adaptive processor (as a single element or set of interacting elements, and whether entirely embodied in physical hardware or in a combined form of hardware, special-purpose firmware, and general processing purpose software applied to effect digital signal processing) that adjusts signal weights on the digital signal(s) transmitted, received, or both, by or through said adaptive processor, in order to optimize an adaptation criterion responsive to a functional purpose or the externalities (transient, temporary, situational, and even permanent) for that processor. The adaptation criterion for the adaptive algorithm may be any of a signal or parameter estimation, measured quality, or any combination thereof.

This method performs a linear transformation of the processor parameters being adapted from M dimensions to (M₁+L) dimensions in each adaptation event, where M₁<<M and L<<M, such that M₁ weights are updated without constraints and M₀=M−M₁ weights are subjected to a soft constraint that forces them into an L-dimensional subspace spanned by those weights at the beginning of the adaptation period. The same dimensionality reduction, using the same linear transformation, is applied to the input data. The reduced-dimensionality weights are then adapted using the same optimization strategy employed by the adaptive processor, except with input data that has also been reduced in dimensionality. In a preferred embodiment the reduced-dimensionality weights are then adapted using exactly the same optimization strategy. In an alternative embodiment, as when there exists any of hardware, software, or combined hardware and software differentiation in “adapt-path” operations used to tune the adaptive processor and “data-path” operations used by the adaptive processor during and after tuning, the method adapts the reduced-dimensionality weights using substantively the same optimization strategy employed by the adaptive processor for the input data to which the same dimensionality reduction has been applied.

The invention has numerous advantages over the conventional PU approach. These include:

-   -   Substantive reduction or elimination of misadjustment effects induced by the hard linear constraint employed in the conventional PU method.
-   -   Applicability to any optimization function, including functions based on optimal (maximum-likelihood, maximum a posteriori, minimum-mean-square) estimation strategies, and methods such as the analytic constant modulus algorithm (ACMA) and cumulant-based techniques that have very high-order complexity.
-   -   Ability to reduce adapt block size N significantly, e.g., to N<M, even when the unconstrained approach experiences instability issues at that block size.
-   -   Ability to develop optimization quality measures, e.g., Cramer-Rao bounds on parameter or signal estimation performance, that also exploit the dimensionality reduction, and that can track the performance degradation (relative to the unconstrained solution) induced by the partial update.
-   -   Ability to operate with much lower update-set sizes than conventional PU, resulting in further reduction in complexity, and therefore cost, of adapt-path processing.
-   -   Ability to be implemented in highly distributed processing architectures, e.g., general-purpose graphical processing units (GPGPUs), that can further exploit the reduced complexity of the approach, or allow processing over multiple parallel update sets with minimal intercommunication between units.
-   -   Applicability to other problems where dimensionality is a known limitation, e.g., pattern recognition over feature sets with large numbers of parameters.

The approach can be used with any update-set selection strategy developed to date, or with new methods exploiting quality measurement advantages of the approach.

The invention is motivated by interpreting prior-art partial-update approaches as hard-constrained optimization algorithms, in which a complex combiner weight vector w having dimension M×1 is updated to optimize metric F(w;n) over adapt block n, i.e.,

$\begin{matrix} {\left. w\leftarrow{\arg \mspace{14mu} \underset{w^{\prime} \in C^{M}}{opt}\mspace{14mu} {F\left( {w^{\prime};n} \right)}} \right.,} & ({Eq1}) \end{matrix}$

subject to additional linear constraint

$\begin{matrix} {{\left( w^{\prime} \right)_{\mathcal{M}_{0}(n)} = \left( w \right)_{\mathcal{M}_{0}(n)}},} & ({Eq2}) \end{matrix}$

where ℳ₀(n)={m∈{1, . . . , M}: m(1;n), . . . , m(M₀;n)}, referred to here as the block n held-set, is a set of M₀<M indices of weights held constant over adapt block n, and where w in (Eq2) is the combiner weights at the beginning of the adapt block. This resultant constrained optimization criterion can be written in compact matrix algebra as

$\begin{matrix} {w \leftarrow \arg \mspace{14mu} \underset{w^{\prime} \in C^{M}}{opt}\mspace{14mu} \left\{ {F\left( {w^{\prime};n} \right)} \ni {M_{0}^{T}(n)}w^{\prime} = {M_{0}^{T}(n)}w \right\}} & ({Eq3}) \end{matrix}$

where M₀(n)=[e_(M)(m₀)]_(m₀∈ℳ₀(n)), with ℳ₀(n) the held-set index set, is the M×M₀ sparse held-set projection matrix, and where e_(M)(m₀)=[δ(m−m₀)]_(m=1) ^(M) is the m₀ ^(th) M×1 Euclidean basis vector and δ(k) is the Kronecker delta function. Example prior-art partial-update algorithms that can be expressed in this manner include:

-   -   The partial-update normalized least-mean-squares (PU-NLMS)         algorithm, which modifies the normalized least-mean-squares         (NLMS) algorithm taught in [Nagumo67]

$\begin{matrix} {{{y(n)} = {{x^{T}(n)}w}},} & ({Eq4}) \\ {\left. w\leftarrow{w + {\mu \frac{x^{*}(n)}{{{x(n)}}_{2}^{2}}\left( {{s(n)} - {y(n)}} \right)}} \right.,{0 < \mu \leq 1}} & ({Eq5}) \\ {= {{{\arg \mspace{14mu} {\min\limits_{w^{\prime} \in C^{M}}{{w^{\prime} - w}}_{2}^{2}}} \ni {{x^{T}(n)}w^{\prime}}} = {\hat{s}(n)}}} & ({Eq6}) \\ {{{\hat{s}(n)} = {{\mu \; {s(n)}} + {\left( {1 - \mu} \right){y(n)}}}},{0 < \mu \leq 1}} & ({Eq7}) \end{matrix}$

-   -    at adapt symbol index n, where x(n) is an M×1 data vector defined over adapt symbol index n, s(n) is a reference scalar known over adapt symbol index n, μ is the NLMS adaptive stepsize, and ∥•∥₂ is the L-2 norm, and where (•)^(T) and (•)* denote the matrix transpose and complex conjugation operations, respectively. Addition of the held-set weight-update constraint

M ₀ ^(T)(n)w′=M ₀ ^(T)(n)w,  (Eq8)

-   -    to (Eq6) yields the PU-NLMS algorithm taught in         [Douglas94,Schertler98,Dogancay01],

$\begin{matrix} {{{y(n)} = {{x^{T}(n)}w}},} & ({Eq9}) \\ {{w_{} = {{M_{}^{T}(n)}w}},{ = 0},1} & ({Eq10}) \\ {{x_{} = {{M_{}^{T}(n)}{x(n)}}},{ = 0},1} & ({Eq11}) \\ {\left. w_{1}\leftarrow{w_{1} + {\mu \frac{x_{1}^{*}(n)}{{{x(n)}}_{2}^{2}}\left( {{s(n)} - {y(n)}} \right)}} \right.,{0 < \mu \leq 1},} & ({Eq12}) \\ \left. w\leftarrow{{{M_{1}(n)}w_{1}} + {{M_{0}(n)}w_{0}}} \right. & ({Eq13}) \end{matrix}$

-   -    where M₁(n)=[e_(M)(m₁)]_(m₁∈ℳ₁(n)) is the M×M₁ update-set projection matrix defined over adapt symbol n, and where ℳ₁(n)={m∈{1, . . . , M}: m∉ℳ₀(n)} is the update-set, complementary to the held-set ℳ₀(n), defined over adapt symbol n.
-   -   The partial-update affine projections (PU-AP) algorithm, which modifies the affine projection algorithm taught in [Ozeki84,Gay93]

$\begin{matrix} {{{y(n)} = {{X(n)}w}},} & ({Eq14}) \\ {\left. w\leftarrow{w + {\mu \; {X^{\dagger}(n)}\left( {{s(n)} - {y(n)}} \right)}} \right.,{0 < \mu \leq 1},} & ({Eq15}) \\ {{= {{{\arg \mspace{14mu} {\min\limits_{w^{\prime} \in C^{M}}{{w^{\prime} - w}}_{2}^{2}}} \ni {{X(n)}w^{\prime}}} = {\hat{s}(n)}}},} & ({Eq16}) \\ {{{\hat{s}(n)} = {{\mu \; {s(n)}} + {\left( {1 - \mu} \right){y(n)}}}},{0 < \mu \leq 1}} & ({Eq17}) \end{matrix}$

-   -    over N-symbol adapt block n, 1≦N≦M, where X(n)=[x(nN+1) . . .         x(nN+N)]^(T) is an N×M data matrix defined over adapt block n,         s(n)=[s(nN+1) . . . s(nN+N)]^(T) is a known N×1 reference vector         defined over adapt block n, μ is an adaptive stepsize, and         (•)^(†) denotes the matrix pseudoinverse operation, given by

X ^(†)(n)=X ^(H)(n)(X(n)X ^(H)(n))⁻¹,  (Eq18)

-   -    for rank{X(n)}=N≦M, and where (•)^(H) and (•)⁻¹ denote the matrix conjugate-transpose (Hermitian) and inverse operations, respectively. Addition of constraint (Eq8) to (Eq16) yields the PU-AP algorithm taught in [Naylor04],

y(n)=X(n)w,  (Eq19)

w _(l) =M _(l) ^(T)(n)w, l=0,1,  (Eq20)

X _(l)(n)=X(n)M _(l)(n), l=0,1,  (Eq21)

w ₁ ←w ₁ +μX ₁ ^(†)(n)(s(n)−y(n)), 0<μ≦1,  (Eq22)

w←M ₁(n)w ₁ +M ₀(n)w ₀.  (Eq23)

-   -   The partial-update block least-squares (PU-BLS) algorithm, which         modifies the block least-squares (BLS) algorithm given by

$\begin{matrix} {{{y(n)} = {{X(n)}w}},} & ({Eq24}) \\ {\left. w\leftarrow{{\left( {1 - \mu} \right)w} + {\mu \; {X^{\dagger}(n)}{s(n)}}} \right.,} & ({Eq25}) \\ {{= {\arg \mspace{14mu} {\min\limits_{w^{\prime} \in C^{M}}{{{s^{\prime}(n)} - {{X(n)}w^{\prime}}}}_{2}^{2}}}},\left\{ \begin{matrix} {{\hat{s}(n)} = {{\mu \; {\hat{s}(n)}} + {\left( {1 - \mu} \right){y(n)}}}} \\ {{y(n)} = {{X(n)}w}} \end{matrix} \right.} & ({Eq26}) \end{matrix}$

-   -    over N-symbol adapt block n, N≧M, where ∥•∥₂ is the L-2 norm and μ is an adaptive stepsize, and where X(n)=[x(nN+1) . . . x(nN+N)]^(T) is an N×M data matrix defined over adapt block n, s(n)=[s(nN+1) . . . s(nN+N)]^(T) is a known N×1 reference vector defined over adapt block n, and (•)^(†) denotes the matrix pseudoinverse operation, given by

X ^(†)(n)=(X ^(H)(n)X(n))⁻¹ X ^(H)(n)  (Eq27)

-   -    for rank{X(n)}=M≦N. Addition of constraint (Eq8) to (Eq26)         yields PU-BLS algorithm

y(n)=X(n)w,  (Eq28)

w _(l) =M _(l) ^(T)(n)w, l=0,1,  (Eq29)

X _(l)(n)=X(n)M _(l)(n), l=0,1,  (Eq30)

w ₁←(1−μ)w ₁ +μX ₁ ^(†)(n)(s(n)−y(n)), 0<μ≦1,  (Eq31)

w←M ₁(n)w ₁ +M ₀(n)w ₀.  (Eq32)

The BLS and PU-BLS algorithms can be interpreted as extensions of the AP and PU-AP algorithms to adapt block sizes N≧M. Similarly, the NLMS and PU-NLMS algorithms can be interpreted as implementations of the AP and PU-AP algorithms for N=1.
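By way of non-limiting illustration only, the prior-art PU-AP recursion of (Eq19) through (Eq23) can be sketched in Python with NumPy as follows. The function name `pu_ap_update` and the zero-based `update_idx` array are assumptions of this sketch; `numpy.linalg.pinv` stands in for the pseudoinverse of (Eq18) applied to X₁(n).

```python
import numpy as np

def pu_ap_update(X, w, s, update_idx, mu=1.0):
    """One hard-constrained partial-update affine projection step.

    X: N x M data block, w: length-M weights, s: length-N reference,
    update_idx: zero-based indices of the M1 weights allowed to change.
    """
    y = X @ w                                   # (Eq19)
    X1 = X[:, update_idx]                       # (Eq21), update-set columns
    w_new = w.astype(complex)                   # copy; held-set weights stay as-is
    w_new[update_idx] += mu * (np.linalg.pinv(X1) @ (s - y))  # (Eq22)
    return w_new                                # (Eq23)
```

When N≦M₁ and X₁(n) has full row rank, a step with μ=1 in this sketch reproduces the reference exactly over the block while leaving every held-set weight untouched, which is the hard constraint discussed in the following paragraphs.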

A number of observations can immediately be made from this interpretation of the partial-update procedure. First, any linear constraint can induce severe misadjustment from the optimal solution sought by the processor. This can manifest as both a convergent or steady-state bias from the optimal solution, and a “jitter” or fluctuation about that steady-state solution. In some applications, e.g., phased array radar applications where the received radar waveform must be extracted from strong clutter and jamming, this can cause the system to fail entirely. Even if the reference signal is received at high SINR, this can lead to well-known “hypersensitivity” issues that degrade system performance from the optimal solution.

Second, the linear constraint can only be easily added to a small subset of optimization functions. In many cases, strict enforcement of the constraint significantly increases complexity of the original method.

The subspace-constrained approach overcomes both of these problems, by replacing the hard linear constraint M₀ ^(T)(n)w′=M₀ ^(T)(n)w with a softer subspace constraint

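$\begin{matrix} {{M_{0}^{T}(n)}w^{\prime} \propto {M_{0}^{T}(n)}w} & \\ {= {M_{0}^{T}(n)}w\, g_{0},\quad g_{0} \in {\mathbb{C}},} & \left( {{Eq}\mspace{14mu} 33} \right) \end{matrix}$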

where the scalar held-set multiplier g₀ and the update-set weights w₁=M₁ ^(T)(n)w are jointly adjusted to optimize the unconstrained criterion given in (Eq1), i.e., by adapting the (M₁+1)×1 enhanced weight vector

$\overset{\sim}{w} = \begin{pmatrix} w_{1} \\ g_{0} \end{pmatrix}$

using optimization formula

$\begin{matrix} \left. \overset{\sim}{w}\leftarrow{\arg \mspace{14mu} \underset{{\overset{\sim}{w}}^{\prime} \in {\mathbb{C}}^{M_{1} + 1}}{opt}{F\left( {{\overset{\sim}{w}}^{\prime};n} \right)}} \right. & \left( {{Eq}\mspace{14mu} 34} \right) \end{matrix}$

over each data block. The full output weight vector is then given by

w=M ₁(n)w ₁ +M ₀(n)w ₀ g ₀,  (Eq35)

which is efficiently computed using vector-scalar multiplies and multiply-free multiplexing (MUX) operations.

For the exemplary NLMS, AP, and BLS optimization criteria given in (Eq6), (Eq16), and (Eq26), respectively, the SCPU algorithms are implemented using the following procedure:

-   -   Separate w into update-set and held-set components w₁ and w₀ using multiply-free demultiplexing (DMX) operations.
-   -   For the SCPU-AP/NLMS algorithms, and for the SCPU-BLS algorithm with μ<1, construct the (M₁+1)×1 dimensional weight vector

$\begin{matrix} {\overset{\sim}{w} = {\begin{pmatrix} w_{1} \\ 1 \end{pmatrix}.}} & \left( {{Eq}\mspace{14mu} 36} \right) \end{matrix}$

-   -   Separate X(n) into update-set and held-set components X₁(n) and X₀(n) using multiply-free columnar DMX operations.
-   -   Compute y₀(n)=X₀(n)w₀. For the SCPU-AP/NLMS algorithms, further compute y₁(n)=X₁(n)w₁ and y(n)=y₁(n)+y₀(n).
-   -   Construct the N×(M₁+1) dimensional SCPU data matrix

{tilde over (X)}(n)=[X ₁(n) y ₀(n)].  (Eq37)

-   -    Note that y(n)={tilde over (X)}(n){tilde over (w)} constructs the output data from the prior weight set.
-   -   Optimize {tilde over (w)} using the original unconstrained algorithm, with dimensionality reduced from M to M₁+1, yielding

SCPU-AP: {tilde over (w)}←{tilde over (w)}+μ{tilde over (X)} ^(†)(n)(s(n)−y(n)), 0<μ≦1  (Eq38)

SCPU-BLS: {tilde over (w)}←(1−μ){tilde over (w)}+μ{tilde over (X)} ^(†)(n)s(n), 0<μ≦1,  (Eq39)

-   -    where the SCPU-AP algorithm degenerates to SCPU-NLMS if N=1, and extends to SCPU-BLS if N≧M₁.
-   -   Update w₁ and w₀ using the formulas

w ₁←[({tilde over (w)})_(m)]_(m=1) ^(M₁)  (Eq40)

w ₀ ←g ₀ w ₀ , g ₀=({tilde over (w)})_(M₁+1),  (Eq41)

-   -    where ({tilde over (w)})_(m) denotes the m^(th) element of vector {tilde over (w)}.
-   -   Reconstruct the linear combiner weights using

w=M ₁(n)w ₁ +M ₀(n)w ₀.  (Eq42)
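By way of non-limiting illustration only, the procedure of (Eq36) through (Eq42) can be sketched in Python with NumPy. The function name `scpu_step`, the `mode` argument, and the zero-based `update_idx` array are assumptions of this sketch, and `numpy.linalg.pinv` is used for the pseudoinverse of the enhanced data matrix regardless of whether N is smaller or larger than M₁+1.

```python
import numpy as np

def scpu_step(X, w, s, update_idx, mu=1.0, mode="bls"):
    """One SCPU-AP (mode="ap") or SCPU-BLS (mode="bls") adapt-block update."""
    M = X.shape[1]
    held_idx = np.setdiff1d(np.arange(M), update_idx)
    w1, w0 = w[update_idx], w[held_idx]              # DMX of the weights
    w_enh = np.append(w1, 1.0).astype(complex)       # (Eq36)
    X1, X0 = X[:, update_idx], X[:, held_idx]        # columnar DMX of the data
    y0 = X0 @ w0                                     # held-set combiner output
    X_enh = np.column_stack([X1, y0])                # (Eq37)
    X_pinv = np.linalg.pinv(X_enh)
    if mode == "ap":
        y = X_enh @ w_enh                            # output from prior weights
        w_enh = w_enh + mu * (X_pinv @ (s - y))      # (Eq38)
    else:
        w_enh = (1 - mu) * w_enh + mu * (X_pinv @ s) # (Eq39)
    w1, g0 = w_enh[:-1], w_enh[-1]                   # (Eq40), (Eq41)
    w_new = np.empty(M, dtype=complex)
    w_new[update_idx] = w1                           # (Eq42), MUX
    w_new[held_idx] = g0 * w0
    return w_new, g0
```

In this sketch the held-set weights are only rescaled by the single complex gain g₀, so they remain in the one-dimensional subspace they occupied at the start of the block; fixing g₀ at unity would recover a hard-constrained partial update.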

The SCPU approach employs a data matrix with a nominal dimensionality increase of one over the equivalent PU data matrix, and requires an additional M₀ complex multiplies to update the held-set weight vector. This complexity increase is substantive for the PU-NLMS algorithm, which has O(M) complexity on the data path and O(M₁) complexity on the adapt path. However, this complexity increase is minor for the PU-BLS algorithm, which has O(M₁ ²) complexity on the adapt path, and for the PU-AP algorithm if the adapt block size N is less than but on the order of the number of updated weights M₁ (N≲M₁). Because the SCPU approach optimizes g₀=({tilde over (w)})_(M₁+1) over the complex field, the algorithm is also guaranteed to have lower misadjustment than the equivalent PU method, which constrains g₀ to unity.

This implementation of a SCPU algorithm obtains a higher degree of efficiency (in comparison with either an unconstrained partial update, or full update algorithm) through reducing the level of repetitive processing and comparison which is needed to obtain the maximally-beneficial level, and mixture, of signal weightings that, when applied to the next processing effort, will produce the correct answer within the noise constraints. If applied so as to remove arbitrarily-imposed limits on either the processing depth, or on the number of criteria to be evaluated, then a satisficing level of accuracy can be reached without sacrificing the capacities which were otherwise artificially constrained. Since the weighting dimensionality is reduced by and to the level of the constraints on the subspace, without changing the data path, the efficiency of the transforming process is improved over the full analytical processing effort.

The SCPU algorithm employs a data matrix with a nominal dimensionality increase of one over the equivalent partial-update (PU) data matrix, and employs an additional O(M₀)-complexity complex scalar-vector multiply to update the held-set weight vector. This can be expressed in the following compact matrix notation:

$\begin{matrix} {{\overset{\sim}{M}(n)} = \begin{bmatrix} {M_{1}(n)} & {{M_{0}(n)}{M_{0}^{T}(n)}w} \end{bmatrix}} & \left( {{Eq}\mspace{14mu} 43} \right) \\ {{\overset{\sim}{x}(n)} = {{X(n)}{\overset{\sim}{M}(n)}}} & \left( {{Eq}\mspace{14mu} 44} \right) \\ {\overset{\sim}{w} = {\arg \mspace{14mu} \underset{\overset{\sim}{w} \in {\mathbb{C}}^{M_{1} + 1}}{opt}{F\left( {\overset{\sim}{w};n} \right)}}} & \left( {{Eq}\mspace{14mu} 45} \right) \\ {{y(n)} = {{\overset{\sim}{X}(n)}\overset{\sim}{w}}} & \left( {{Eq}\mspace{14mu} 46} \right) \\ {\left. w\leftarrow{{\overset{\sim}{M}(n)}\overset{\sim}{w}} \right.,} & \left( {{Eq}\mspace{14mu} 47} \right) \end{matrix}$

where {tilde over (M)}(n) is an M×(M₁+1) sparse mapping matrix that reduces dimensionality of X(n) ahead of the optimization algorithm described symbolically in (Eq45).
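By way of non-limiting illustration only, the compact notation of (Eq43) through (Eq47) can be sketched in Python with NumPy for the sum-of-squares metric. The function names `scpu_mapping` and `scpu_bls_compact` and the zero-based `update_idx` array are assumptions of this sketch, and `numpy.linalg.lstsq` stands in for the generic optimizer of (Eq45).

```python
import numpy as np

def scpu_mapping(M, update_idx, w):
    """Build the M x (M1+1) mapping matrix of (Eq43) for current weights w."""
    held_idx = np.setdiff1d(np.arange(M), update_idx)
    M1 = np.eye(M)[:, update_idx]
    w_held = np.zeros(M, dtype=complex)
    w_held[held_idx] = w[held_idx]        # M0 M0^T w: w with update-set entries zeroed
    return np.column_stack([M1, w_held])

def scpu_bls_compact(X, w, s, update_idx):
    """SCPU update in compact form, using a least-squares metric for (Eq45)."""
    M_map = scpu_mapping(X.shape[1], update_idx, w)
    X_enh = X @ M_map                                   # (Eq44)
    w_enh, *_ = np.linalg.lstsq(X_enh, s, rcond=None)   # (Eq45), LS instance
    y = X_enh @ w_enh                                   # (Eq46)
    return M_map @ w_enh, y                             # (Eq47)
```

The explicit matrix products are shown only to mirror the notation; an implementation following the multiply-free DMX and MUX operations described earlier would avoid forming the mapping matrix.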

This compact notation reveals some additional advantages of the approach:

-   -   The approach is inherently more stable than the unconstrained algorithm on a block-by-block basis, because it updates fewer weights than the unconstrained method, without introducing explicit hard constraints that lead to adaptive “jitter.” Hypersensitivity effects due to large noise subspaces in the received data should be especially reduced in the SCPU method.
-   -   The approach is usable with any optimization criterion, including non-quadratic criteria such as general and analytic constant modulus cost functions [Treichler83,Agee86,Van Der Veen 96], cumulant-based objective functions, and eigenvalue-based objective functions [Agee89b,Agee90].
-   -   The approach admits both SCPU maximum-likelihood signal and parameter estimation approaches, and reduced-complexity, constrained quality metrics such as signal-to-interference-and-noise ratio (SINR), Cramer-Rao bounds on parameter estimates, and information-theoretic channel capacity. These metrics may lead to new update-set selection strategies that can overcome identified issues with methods developed to date.

The mapping given in (Eq43) can be extended in many ways to enhance other attributes of the algorithm, e.g., ability to track multiple signals, new selection strategies, and so on. In particular, the approach immediately yields nonblind multiport extensions in which adaptation algorithms are used to extract multiple signals from a received environment.

Two multiport extensions of the AP and BLS methods are taught here based on the unconstrained nonblind algorithms given by

AP: W←W+μX ^(†)(n)(S(n)−Y(n)), 0<μ≦1  (Eq48)

BLS: W←(1−μ)W+μX ^(†)(n)S(n), 0<μ≦1,  (Eq49)

where W is an M×P combiner matrix, Y(n)=X(n)W is an N×P matrix of combiner output data formed over adapt block n using W, and S(n) is an N×P matrix of reference data known over adapt block n. These multiport extensions include the following:

-   -   An uncoupled multiport extension in which (Eq43) is replaced by         P separate mapping matrices

{tilde over (M)}(n;p)=[M ₁(n)  M ₀(n)M ₀ ^(T)(n)w(p)], p=1, . . . ,P,  (Eq50)

{tilde over (X)}(n;p)=X(n){tilde over (M)}(n;p), p=1, . . . ,P,  (Eq51)

-   -    i.e., the SCPU constraint (Eq33) is broadened to P separate         constraints

M ₀ ^(T)(n)w′(p)=M ₀ ^(T)(n)w(p)g ₀(p), g ₀(p)∈ℂ, p=1, . . . ,P.  (Eq52)

-   -    The uncoupled SCPU-BLS algorithm is then given by

$\begin{matrix} {{\overset{\sim}{X}\left( {n;p} \right)} = {{X(n)}{\overset{\sim}{M}\left( {n;p} \right)}}} & \left( {{Eq}\mspace{14mu} 53} \right) \\ {{\overset{\sim}{w}(p)} = \begin{pmatrix} {{M_{1}^{T}(n)}{w(p)}} \\ 1 \end{pmatrix}} & \left( {{Eq}\mspace{14mu} 54} \right) \\ {\left. {\overset{\sim}{w}(p)}\leftarrow{{\left( {1 - \mu} \right){\overset{\sim}{w}(p)}} + {\mu \; {{\overset{\sim}{X}}^{\dagger}\left( {n;p} \right)}{s\left( {n;p} \right)}}} \right.,\mspace{14mu} {0 \leq \mu \leq 1}} & \left( {{Eq}\mspace{14mu} 55} \right) \\ {{y\left( {n;p} \right)} = {{\overset{\sim}{X}\left( {n;p} \right)}{\overset{\sim}{w}(p)}}} & \left( {{Eq}\mspace{14mu} 56} \right) \\ \left. {w(p)}\leftarrow{{\overset{\sim}{M}\left( {n;p} \right)}{\overset{\sim}{w}(p)}} \right. & \left( {{Eq}\mspace{14mu} 57} \right) \end{matrix}$

-   -    for each port p=1, . . . , P, where s(n;p) and w(p) are the p^(th) columns of S(n) and W, respectively, and where (Eq54) is only needed if μ<1.
-   -   A fully-coupled multiport extension, in which (Eq43) is replaced by the global mapping matrix

{tilde over (M)}(n)=[M ₁(n)  M ₀(n)M ₀ ^(T)(n)W],  (Eq58)

-   -    i.e., the SCPU constraint (Eq33) is broadened to

M ₀ ^(T)(n)W′=M ₀ ^(T)(n)WG ₀ , G ₀∈ℂ^(P×P).  (Eq59)

-   -    The fully-coupled SCPU-BLS algorithm is then given by

$\begin{matrix} {{{\overset{\sim}{X}(n)} = {{X(n)}{\overset{\sim}{M}(n)}}},} & \left( {{Eq}\mspace{14mu} 60} \right) \\ {\overset{\sim}{W} = \begin{pmatrix} {{M_{1}^{T}(n)}W} \\ I_{P} \end{pmatrix}} & \left( {{Eq}\mspace{14mu} 61} \right) \\ {\left. \overset{\sim}{W}\leftarrow{{\left( {1 - \mu} \right)\overset{\sim}{W}} + {\mu \; {{\overset{\sim}{X}}^{\dagger}(n)}{S(n)}}} \right.,\mspace{14mu} {0 < \mu \leq 1}} & \left( {{Eq}\mspace{14mu} 62} \right) \\ {{Y(n)} = {{\overset{\sim}{X}(n)}\overset{\sim}{W}}} & \left( {{Eq}\mspace{14mu} 63} \right) \\ {\left. W\leftarrow{{\overset{\sim}{M}(n)}\overset{\sim}{W}} \right.,} & \left( {{Eq}\mspace{14mu} 64} \right) \end{matrix}$

and where (Eq61) is only needed if μ<1.
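By way of non-limiting illustration only, the fully-coupled SCPU-BLS recursion of (Eq58) and (Eq60) through (Eq64) can be sketched in Python with NumPy. The function name `scpu_bls_coupled` and the zero-based `update_idx` array are assumptions of this sketch, and `numpy.linalg.pinv` is used for the pseudoinverse of the enhanced data matrix.

```python
import numpy as np

def scpu_bls_coupled(X, W, S, update_idx, mu=1.0):
    """One fully-coupled multiport SCPU-BLS update over an adapt block.

    X: N x M data, W: M x P weights, S: N x P references.
    """
    M, P = W.shape
    held_idx = np.setdiff1d(np.arange(M), update_idx)
    W_held = np.zeros((M, P), dtype=complex)
    W_held[held_idx] = W[held_idx]                        # M0 M0^T W
    M_map = np.column_stack([np.eye(M)[:, update_idx], W_held])  # (Eq58)
    X_enh = X @ M_map                                     # (Eq60), N x (M1+P)
    W_enh = np.vstack([W[update_idx], np.eye(P)])         # (Eq61)
    W_enh = (1 - mu) * W_enh + mu * (np.linalg.pinv(X_enh) @ S)  # (Eq62)
    Y = X_enh @ W_enh                                     # (Eq63)
    return M_map @ W_enh, Y                               # (Eq64)
```

In this sketch the last P rows of the enhanced weight matrix play the role of G₀, so each port's updated held-set weights are a linear combination of all P ports' prior held-set weights, consistent with (Eq59).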

In an efficient embodiment, the uncoupled multiport SCPU-BLS extension is implemented using whitening methods that exploit the common components of {{tilde over (X)}(n;p)}_(p=1) ^(P), i.e., the N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n).

In particular, using the QR decomposition of {tilde over (X)}(n;p), given by

$\begin{matrix} {{\left\{ {Q,R} \right\} = {{QRD}(X)}},\mspace{31mu} \left\{ \begin{matrix} {R = {{chol}\left\{ {X^{H}X} \right\}}} \\ {Q = {XR}^{- 1}} \end{matrix} \right.} & \left( {{Eq}\mspace{14mu} 65} \right) \\ \left. \Rightarrow\left\{ \begin{matrix} {X = {QR}} \\ {{Q^{H}Q} = I_{M}} \end{matrix} \right. \right. & \left( {{Eq}\mspace{14mu} 66} \right) \end{matrix}$

for general N×M matrix X with rank{X}=M≦N, where I_(M) is the M×M identity matrix and chol{•} is the Cholesky decomposition yielding upper-triangular matrix R with real-positive diagonal values, the uncoupled multiport SCPU-BLS algorithm given in (Eq55) can be efficiently implemented by first computing the QRD of the common update-set data matrix,

{Q ₁ ,R ₁₁ }=QRD(X ₁(n)),  (Eq67)

and then updating each port p using the recursion

$\begin{matrix} {y_{0} \leftarrow {X_{0}(n)}{w_{0}(p)}} & \left( {{Eq}\mspace{14mu} 68} \right) \\ {r_{10} \leftarrow Q_{1}^{H}y_{0}} & \left( {{Eq}\mspace{14mu} 69} \right) \\ {u_{1} \leftarrow Q_{1}^{H}{s\left( {n;p} \right)}} & \left( {{Eq}\mspace{14mu} 70} \right) \\ {{g_{0}(p)} \leftarrow \frac{y_{0}^{H}{s\left( {n;p} \right)} - r_{10}^{H}u_{1}}{\left\| y_{0} \right\|_{2}^{2} - \left\| r_{10} \right\|_{2}^{2}}} & \left( {{Eq}\mspace{14mu} 71} \right) \\ {{w_{1}(p)} \leftarrow \left( 1 - \mu \right){w_{1}(p)} + \mu \; R_{11}^{- 1}\left( u_{1} - r_{10}{g_{0}(p)} \right),\quad 0 < \mu \leq 1} & \left( {{Eq}\mspace{14mu} 72} \right) \\ {{g_{0}(p)} \leftarrow \left( 1 - \mu \right) + \mu \; {g_{0}(p)},\quad 0 < \mu \leq 1} & \left( {{Eq}\mspace{14mu} 73} \right) \end{matrix}$

where

${\overset{\sim}{w}(p)} = {\begin{pmatrix} {w_{1}(p)} \\ {g_{0}(p)} \end{pmatrix}.}$

This recursion also admits unbiased quality statistic

$\begin{matrix} {{{\overset{\sim}{\gamma}}_{\max}\left( {n;p} \right)} = \left( 1 - \frac{M_{1} + 1}{N} \right)\left( \frac{\overset{\sim}{\eta}}{1 - \overset{\sim}{\eta}} \right) - \frac{M_{1} + 1}{N},\quad \overset{\sim}{\eta} = \frac{\left\| u_{1} \right\|_{2}^{2} + \left\| u_{0} \right\|_{2}^{2}}{\left\| {s\left( {n;p} \right)} \right\|_{2}^{2}}} & \left( {{Eq}\mspace{14mu} 74} \right) \end{matrix}$

for each port p, which estimates the relative power between the port p reference signal and background clutter at the output of the port p linear combiner, also referred to as the signal-to-interference-and-noise ratio (SINR) of the combiner output signal.
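By way of non-limiting illustration only, the shared-QRD recursion of (Eq67) through (Eq73) can be sketched in Python with NumPy. The function name `scpu_bls_qr` and the zero-based `update_idx` array are assumptions of this sketch. `numpy.linalg.qr` is used in place of the Cholesky-based QRD of (Eq65); its R factor is not guaranteed to have a real-positive diagonal, which does not affect the recursion because only the relations X₁=Q₁R₁₁ and Q₁^(H)Q₁=I are used.

```python
import numpy as np

def scpu_bls_qr(X, W, S, update_idx, mu=1.0):
    """Uncoupled multiport SCPU-BLS update sharing one QRD of X1 across ports."""
    M, P = W.shape
    held_idx = np.setdiff1d(np.arange(M), update_idx)
    X1, X0 = X[:, update_idx], X[:, held_idx]
    Q1, R11 = np.linalg.qr(X1)                    # (Eq67), computed once
    Q1H = Q1.conj().T
    W_new = np.empty((M, P), dtype=complex)
    for p in range(P):
        w1, w0, s = W[update_idx, p], W[held_idx, p], S[:, p]
        y0 = X0 @ w0                              # (Eq68)
        r10 = Q1H @ y0                            # (Eq69)
        u1 = Q1H @ s                              # (Eq70)
        num = np.vdot(y0, s) - np.vdot(r10, u1)   # vdot conjugates its first argument
        den = np.vdot(y0, y0).real - np.vdot(r10, r10).real
        g0 = num / den                            # (Eq71)
        w1 = (1 - mu) * w1 + mu * np.linalg.solve(R11, u1 - r10 * g0)  # (Eq72)
        g0 = (1 - mu) + mu * g0                   # (Eq73)
        W_new[update_idx, p] = w1
        W_new[held_idx, p] = w0 * g0
    return W_new
```

For μ=1 this sketch returns the same weights as solving a separate least-squares problem on each enhanced data matrix, while the QRD of the common update-set data matrix is computed only once for all P ports.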

The SCPU method is also easily extended to partially blind methods in which the reference vector s(n) is partially known at the receive processor over adapt block n, e.g., the reference vector has an unknown carrier or timing offset relative to the sequence contained in the input data sequence, and to fully blind methods in which the reference vector is unknown but has some known, exploitable structure. Specific examples include:

-   -   Carrier-timing tracking SCPU-BLS algorithms, in which s(n) has         an unknown timing and/or carrier offset, e.g., due to         propagation delay, Doppler shift and carrier LO uncertainty         between the input data and an original transmitted signal         containing the reference signal, or a combined frequency shift         due to timing and carrier offset if the input data is derived         from an OFDM or OFDMA demodulation process. This algorithm         replaces the nonblind weight adaptation algorithm given in         (Eq39) with

{tilde over (w)}←(1−μ){tilde over (w)}+μ{tilde over (X)} ^(†)(n)(s({circumflex over (n)} _(off) ;n)∘δ({circumflex over (ω)}_(off)))  (Eq75)

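$\begin{matrix} {{s\left( {n_{off};n} \right)} = \left\lbrack {s\left( {nN + n_{sym} + n_{off}} \right)} \right\rbrack_{n_{sym} = 1}^{N}} & \left( {{Eq}\mspace{14mu} 76} \right) \\ {{\delta\left( \omega_{off} \right)} = \left\lbrack e^{j\omega_{off}n_{sym}} \right\rbrack_{n_{sym} = 1}^{N}} & \left( {{Eq}\mspace{14mu} 77} \right) \end{matrix}$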

-   -    where {s(n_(sym))} is a component of the transmitted signal         that is known over the adapt block except for timing offset         n_(off) and a carrier offset ω_(off), and where “∘” is the         element-wise matrix multiplication operation. The timing and         carrier offset can be optimized over each adapt block by setting

$\begin{matrix} {\left\{ {{\hat{\omega}}_{off}(n)},{{\hat{n}}_{off}(n)} \right\} = \arg \max\limits_{\omega_{off},n_{off}}{\eta\left( {\omega_{off},n_{off};n} \right)},} & \left( {{Eq}\mspace{14mu} 78} \right) \\ {{\eta\left( {\omega_{off},n_{off};n} \right)} = \frac{\left\| {{\overset{\sim}{Q}}^{H}(n)}\left( {s\left( {n_{off};n} \right)} \circ {\delta\left( \omega_{off} \right)} \right) \right\|_{2}^{2}}{\left\| {s\left( {n_{off};n} \right)} \right\|_{2}^{2}},} & \left( {{Eq}\mspace{14mu} 79} \right) \\ {= \frac{\left\| \sum\limits_{n_{sym} = 1}^{N}{{\overset{\sim}{q}}^{*}\left( n_{sym} \right)}{s\left( {nN + n_{sym} + n_{off}} \right)}e^{j\omega_{off}n_{sym}} \right\|_{2}^{2}}{\sum\limits_{n_{sym} = 1}^{N}\left| {s\left( {nN + n_{sym} + n_{off}} \right)} \right|^{2}},} & \left( {{Eq}\mspace{14mu} 80} \right) \end{matrix}$

-   -    where {tilde over (Q)}=[{tilde over (q)}(1) . . . {tilde over (q)}(N)]^(T) is the Q-component of the QRD of {tilde over (X)}(n). Equation (Eq78) can be efficiently implemented using fast Fourier transform (FFT) methods if the frequency offset ω is completely unknown (acquisition phases), and using Gauss-Newton or Newton methods if the frequency offset ω is known closely (tracking phases).
-   -    Equation (Eq78) also admits quality statistic

$\begin{matrix} {{{\overset{\sim}{\gamma}\left( {\omega_{off},{n_{off};n}} \right)} = {{\left( {1 - \frac{M_{1} + 1}{N}} \right)\left( \frac{\overset{\sim}{\eta}\left( {\omega_{off},{n_{off};n}} \right)}{1 - {\overset{\sim}{\eta}\left( {\omega_{off},{n_{off};n}} \right)}} \right)} - \frac{M_{1} + 1}{N}}},} & \left( {{Eq}\mspace{14mu} 81} \right) \end{matrix}$

-   -    which estimates the SINR of the combiner output signal.     -   Property-mapping SCPU-BLS algorithms, in which s(n) is a member         of a known property set. These algorithms replace the nonblind         weight adaptation algorithm given in (Eq39) with         property-mapping recursion

$\begin{matrix} {{y(n)} = {{\overset{\sim}{X}(n)}\overset{\sim}{w}}} & \left( {{Eq}\mspace{14mu} 82} \right) \\ {{\hat{s}(n)} = {\arg {\min\limits_{s \in {{(n)}}}{{s - {y(n)}}}}}} & \left( {{Eq}\mspace{14mu} 83} \right) \\ {\left. \overset{\sim}{w}\leftarrow{{\left( {1 - \mu} \right)\overset{\sim}{w}} + {\mu \; {{\overset{\sim}{X}}^{\dagger}(n)}{\hat{s}(n)}}} \right.,} & \left( {{Eq}\mspace{14mu} 84} \right) \end{matrix}$

-   -    where         (n) is a desired signal set, potentially variable as a function         of adapt block n, that s(n) is known to belong to. For example,         the constant modulus property set:         (n)={z∈         ^(N):|(z)_(n)|=1} yields

{circumflex over (s)}(n)=sgn{y(n)}  (Eq85)

-   -    where sgn{•} is the element-wise complex sign function         sgn{z}=z/|z| on each element, resulting in an SCPU-BLS         constant-modulus algorithm. Other exemplary mappings include         known modulus mappings in which the elements of s(n) have known         magnitude but unknown phase, and decision-direction mappings in         which each element of s(n) belongs to a known set of finite         values, possibly with an unknown carrier offset.     -    In all cases, the property-mapping algorithm is applicable to         cases in which s(n) does not perfectly possess the property used         by the algorithm, but substantively conforms to that property,         e.g., |s(nN+n_(sym))|≈1.     -   Dominant-mode prediction (DMP) algorithms, in which s(n) is         known to be substantively present in a linear subspace with         known or estimable structure, such that

s(n)≈(U _(s)(n)U _(s) ^(H)(n))s(n)  (Eq86)

-   -    for a known or postulated N×N_(s)(n) orthonormal basis U_(s)(n), N_(s)(n)<N, and/or such that s(n) is known to be substantively absent from a linear subspace with known or estimable structure, such that

U _(⊥) ^(H)(n)s(n)≈0  (Eq87)

-   -    for a known or postulated complementary N×N_(⊥)(n) orthonormal basis U_(⊥)(n), in which N_(s)(n)+N_(⊥)(n)≦N. If only one subspace is available, the other can be derived from it, for example by deriving U_(⊥)(n) from I_(N)−(U_(s)(n)U_(s) ^(H)(n)) or vice versa.     -    In this case, the enhanced weight update algorithm is given by

$\begin{matrix} {{\overset{\sim}{w} = {\arg {\max\limits_{w \in {\mathbb{C}}^{M_{1} + 1}}{\overset{\sim}{\gamma}\left( {w;n} \right)}}}},} & \left( {{Eq}\mspace{14mu} 88} \right) \\ {{{\overset{\sim}{\gamma}\left( {w;n} \right)} = \frac{{w^{H}\left( {{{\overset{\sim}{X}}_{s}^{H}(n)}{{\overset{\sim}{X}}_{s}(n)}} \right)}w}{{w^{H}\left( {{{\overset{\sim}{X}}_{\bot}^{H}(n)}{{\overset{\sim}{X}}_{\bot}(n)}} \right)}w}},\left\{ {\begin{matrix} {{{\overset{\sim}{X}}_{s}(n)} = {{U_{s}^{H}(n)}{\overset{\sim}{X}(n)}}} \\ {{{\overset{\sim}{X}}_{\bot}(n)} = {{U_{\bot}^{H}(n)}{\overset{\sim}{X}(n)}}} \end{matrix}.} \right.} & \left( {{Eq}\mspace{14mu} 89} \right) \end{matrix}$

-   -    The enhanced combiner weights {tilde over (w)}_(max) that         maximize (Eq89), and the maximal value of (Eq89), {tilde over         (γ)}_(max), are equal to the dominant solution {{tilde over         (γ)}₁,{tilde over (w)}₁} of the DMP eigenequation,

{tilde over (γ)}_(m)({tilde over (X)} _(⊥) ^(H)(n){tilde over (X)} _(⊥)(n)){tilde over (w)} _(m)=({tilde over (X)} _(s) ^(H)(n){tilde over (X)} _(s)(n)){tilde over (w)} _(m), {tilde over (γ)}_(m)≧{tilde over (γ)}_(m+1).  (Eq90)

-   -    The dominant eigenvalue {tilde over (γ)}₁ also provides an         estimate of the SINR of the combiner output signal, and can be         used both to detect the target signal, and to search over         postulated subspaces to find the subspace that most closely         contains or rejects s(n).     -    Example subspaces include:         -   Known or postulated time slots used by s(n), such that

${{s\left( {{nN} - n_{sym}} \right)}}{\frac{1}{N}{{s(n)}}_{2}}$

-   -    over some known or searchable subset of symbol indices within         the adapt block. This generates SCPU time-gated DMP (TG-DMP)         algorithms.         -   Known or postulated frequency channels used by s(n), such             that

$\begin{matrix} {\frac{{{\sum\limits_{n_{sym} = 1}^{N - 1}\; {{h\left( n_{sym} \right)}{s\left( {{nN} + n_{sym}} \right)}^{{- {j\omega}}\; n_{sym}}}}}^{2}}{\sum\limits_{n_{sym} = 1}^{N - 1}\; {{h^{2}\left( n_{sym} \right)}{\sum\limits_{n_{sym} = 1}^{N - 1}\; {{h\left( n_{sym} \right)}{{s\left( {{nN} + n_{sym}} \right)}}^{2}}}}}1} & \left( {{Eq}\mspace{14mu} 91} \right) \end{matrix}$

-   -    over a known or searchable subset of frequency offsets {ω_(⊥)},         where {h(n_(sym))}_(n) _(sym) ₌₁ ^(N) is a lowpass windowing         function, e.g., a Gaussian or Hamming window. This generates         SCPU frequency-gated DMP (FG-DMP) algorithms.         -   Known or postulated CDMA codes used by s(n), such that

$\begin{matrix} {\frac{{{\sum\limits_{n_{sym} = 1}^{N - 1}\; {{c^{*}\left( {n_{sym} - n_{off}} \right)}{s\left( {{nN} + n_{sym}} \right)}^{{- {j\omega}_{off}}n_{sym}}}}}^{2}}{\sum\limits_{n_{sym} = 1}^{N - 1}\; {{{c\left( {n_{sym} - n_{off}} \right)}}^{2}{\sum\limits_{n_{sym} = 1}^{N - 1}\; {{s\left( {{nN} - n_{sym}} \right)}}^{2}}}} \approx 1} & \left( {{Eq}\mspace{14mu} 92} \right) \end{matrix}$

-   -    over a known or searchable subset of carrier offsets {ω_(off)} and timing offsets {n_(off)}, where {c(n_(sym))} is a known spreading code. This generates SCPU code-gated DMP (CG-DMP) algorithms.         -   Known or postulated restricted isometry properties (RIP) possessed by s(n), such that it occupies a sparse subset of a basis U_(s) that is known (oracular basis), or that satisfies some sparsity property (general RIP). This generates adaptive decompression algorithms in compressed sensing applications.     -   Conjugate self-coherence restoral (C-SCORE) algorithms, in which s(n) is known to have substantive conjugate self-coherence at some known or estimable frequency offset ω, such that

$\begin{matrix} {\frac{{\sum\limits_{n_{sym} = 1}^{N}\; {{s^{2}\left( {{nN} + n_{sym}} \right)}^{{- {j\omega}}\; n_{sym}}}}}{\sum\limits_{n_{sym} = 1}^{N}\; {{s\left( {{nN} - n_{sym}} \right)}}^{2}} \approx 1} & \left( {{Eq}\mspace{14mu} 93} \right) \end{matrix}$

-   -    In this case, the enhanced weight update algorithm is given by

$\begin{matrix} {{\overset{\sim}{w} = {\arg {\max\limits_{w \in {\mathbb{C}}^{M_{1} + 1}}{\overset{\sim}{\rho}\left( {\left. w \middle| \omega \right.;n} \right)}}}},} & \left( {{Eq}\mspace{14mu} 94} \right) \\ {{{\overset{\sim}{\rho}\left( {\left. w \middle| \omega \right.;n} \right)} = \frac{{{w^{H}\left( {{{\overset{\sim}{X}}^{H}(n)}{\Delta (\omega)}{{\overset{\sim}{X}}^{*}(n)}} \right)}w^{*}}}{{w^{H}\left( {{{\overset{\sim}{X}}^{H}(n)}\overset{\sim}{X}(n)} \right)}w}},{{\Delta (\omega)} = {{diag}\left\{ ^{{j\omega}\; n_{sym}} \right\}_{n_{sym} = 1}^{N}}}} & \left( {{Eq}\mspace{14mu} 95} \right) \end{matrix}$

-   -    for a postulated twice-carrier offset ω. The enhanced combiner weights {tilde over (w)}_(max) that maximize (Eq95), and the maximal value of (Eq95), {tilde over (ρ)}_(max)(ω;n), are equal to the dominant solution {{tilde over (ρ)}₁(ω),{tilde over (w)}₁(ω)} of the C-SCORE pseudo-eigenequation,

{tilde over (ρ)}_(m)(ω)({tilde over (X)} ^(H)(n){tilde over (X)}(n)){tilde over (w)} _(m)(ω)=({tilde over (X)} ^(H)(n)Δ(ω){tilde over (X)}*(n)){tilde over (w)} _(m)*(ω), {tilde over (ρ)}_(m)≧{tilde over (ρ)}_(m+1).  (Eq96)

-   -    The SCPU C-SCORE algorithm is expected to have application to BPSK, MSK, and GMSK signals, such as the 1 Mbps (BPSK) 802.11 DSSS signal. The algorithm also extends to carrier-tracking algorithms that use an FFT-based search algorithm. In this case, the line spectrum used to detect the SOIs will be the dominant pseudoeigenmode {tilde over (ρ)}_(max)(ω;n).
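By way of non-limiting illustration of the timing and carrier tracking algorithms described above, the acquisition-phase search of (Eq78) through (Eq80) can be sketched in Python with NumPy for a single postulated timing offset. The function name `carrier_search`, the argument `nfft`, and the simulated data are assumptions made for this sketch and do not appear in the specification. The sketch relies on the observation that the sum in (Eq80), evaluated on a uniform grid of carrier offsets, is an inverse FFT up to scaling; indexing symbols from 0 rather than 1 only introduces a common phase factor that does not change the norm.

```python
import numpy as np

def carrier_search(X_tilde, s, nfft=None):
    """Grid search of eta in (Eq79)/(Eq80) over carrier offset (hypothetical helper).

    X_tilde : (N, K) reduced-dimensionality data matrix for one adapt block
    s       : (N,) known reference, with the postulated timing offset already applied
    Returns the maximizing carrier offset (radians/symbol) and the maximal eta.
    """
    N = X_tilde.shape[0]
    nfft = nfft or N
    Q, _ = np.linalg.qr(X_tilde)          # thin QR: Q is (N, K) with orthonormal columns
    a = np.conj(Q) * s[:, None]           # q*(n_sym) s(n_sym) for each column of Q
    # np.fft.ifft uses exp(+j 2 pi k n / nfft) with a 1/nfft scale, which is undone here.
    spec = np.fft.ifft(a, n=nfft, axis=0) * nfft
    eta = np.sum(np.abs(spec) ** 2, axis=1) / np.sum(np.abs(s) ** 2)
    k = int(np.argmax(eta))
    return 2 * np.pi * k / nfft, eta[k]

# Assumed demo data: a unit-modulus reference received with carrier offset omega0.
rng = np.random.default_rng(0)
N, K = 64, 3
n_sym = np.arange(N)
s = np.exp(2j * np.pi * rng.integers(0, 4, N) / 4)
omega0 = 2 * np.pi * 5 / N
a_vec = rng.standard_normal(K) + 1j * rng.standard_normal(K)
noise = 0.01 * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K)))
X_tilde = np.outer(s * np.exp(1j * omega0 * n_sym), a_vec) + noise
omega_hat, eta_max = carrier_search(X_tilde, s)
```

In a tracking phase, the grid maximum found this way could serve as the starting point for the Gauss-Newton or Newton refinement mentioned above; that refinement is not shown here.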

Extension of all of these algorithms to fully-coupled and uncoupled multiport SCPU methods is straightforward.
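As one illustration of the single-port algorithms listed above, the property-mapping recursion of (Eq82) through (Eq84) with the constant-modulus mapping of (Eq85) can be sketched as follows. The function names, the use of `numpy.linalg.pinv` for the pseudoinverse, and the simulated data are assumptions of this sketch rather than part of the specification. With μ=1 each step alternates between projecting onto the constant-modulus set and projecting onto the column space of the data matrix, so the distance between the combiner output and its constant-modulus mapping cannot increase.

```python
import numpy as np

def cm_property_map(y):
    """Element-wise complex sign z/|z| of (Eq85); zero entries map to 1 (assumption)."""
    mag = np.abs(y)
    return np.where(mag > 0, y / np.where(mag > 0, mag, 1.0), 1.0)

def scpu_bls_cma_step(X_tilde, w_tilde, mu=1.0):
    """One property-mapping update (hypothetical helper name)."""
    y = X_tilde @ w_tilde                     # (Eq82) combiner output
    s_hat = cm_property_map(y)                # (Eq83) specialised to (Eq85)
    w_ls = np.linalg.pinv(X_tilde) @ s_hat    # pseudoinverse solution
    return (1.0 - mu) * w_tilde + mu * w_ls   # (Eq84)

def cm_cost(X_tilde, w_tilde):
    """Distance between the combiner output and its constant-modulus mapping."""
    y = X_tilde @ w_tilde
    return np.linalg.norm(y - cm_property_map(y))

# Assumed demo data: one unit-modulus target plus one Gaussian interferer.
rng = np.random.default_rng(1)
N, K = 128, 4
s = np.exp(2j * np.pi * rng.random(N))
interferer = rng.standard_normal(N) + 1j * rng.standard_normal(N)
a_vec = rng.standard_normal(K) + 1j * rng.standard_normal(K)
b_vec = rng.standard_normal(K) + 1j * rng.standard_normal(K)
noise = 0.05 * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K)))
X_tilde = np.outer(s, a_vec) + np.outer(interferer, b_vec) + noise
w_tilde = rng.standard_normal(K) + 1j * rng.standard_normal(K)

cost_start = cm_cost(X_tilde, w_tilde)
for _ in range(20):
    w_tilde = scpu_bls_cma_step(X_tilde, w_tilde, mu=1.0)
cost_end = cm_cost(X_tilde, w_tilde)
```

Here `X_tilde` stands in for the reduced-dimensionality SCPU data matrix; the same recursion applies unchanged to any other property set for which the minimization in (Eq83) can be evaluated.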

It should also be recognized that, while all of the techniques described here are defined over the “complex field,” such that w∈ℂ^(M), they are equally applicable to combiners and optimization metrics defined over other fields, including the real field, e.g., w∈ℝ^(M), and Galois fields usable in integer field codes. In each case, the subspace constraint

M ₀ ^(T)(n)w′∝M ₀ ^(T)(n)w

=M ₀ ^(T)(n)wg ₀ , g ₀∈𝔽,  (Eq97)

where 𝔽 is the field in which each element of w is defined, results in a valid SCPU method. The method is also applicable to linear-conjugate-linear (LCL) methods

$\begin{matrix} {{{{{\overset{\_}{M}}_{0}^{T}(n)}{\overset{\_}{w}}^{\prime}} \propto {{{\overset{\_}{M}}_{0}^{T}(n)}\overset{\_}{w}}},\left\{ {{\begin{matrix} {\overset{\_}{w} = {\frac{1}{\sqrt{2}}\begin{pmatrix} w \\ w^{*} \end{pmatrix}}} \\ {{M_{}(n)} = \left\lbrack {{M_{}(n)}\mspace{31mu} {M_{}(n)}} \right\rbrack} \end{matrix} = {{{\overset{\_}{M}}_{0}^{T}(n)}\overset{\_}{w}{\overset{\_}{g}}_{0}}},\mspace{31mu} {{\overset{\_}{g}}_{0} = {\frac{1}{\sqrt{2}}\begin{pmatrix} g_{0} \\ g_{0}^{*} \end{pmatrix}}},} \right.} & \left( {{Eq}\mspace{14mu} 98} \right) \end{matrix}$

-   -   which allows the SCPU method to be applied to optimization         functions that are more complicated functions of complex         variables. Moreover the techniques are applicable to processors         that implement nonlinear functions on the data path as well as         the adapt path, if the original optimization constraint is a         linear function of w. In a preferred embodiment the step of         performing a dimensionality reduction comprising a linear         transformation of the processor parameters being adapted from         M-dimensions to (M₁+L)-dimensions in each adaptation event         comprises applying a subspace constraint described in the form:

M ₀ ^(T)(n)w′∝M ₀ ^(T)(n)w  (Eq. 99)

=M ₀ ^(T)(n)wg ₀.  (Eq. 100)
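For L=1, the effect of this subspace constraint can be sketched for a block least-squares criterion with μ=1. The sketch below is illustrative only: the function name `scpu_bls_update`, the representation of the selection matrices M₁(n) and M₀(n) by index arrays, and the simulated data are assumptions. The held-set weights are collapsed into a single combined data column y₀(n)=X₀(n)w₀, the (M₁+1)-dimensional enhanced weight vector is optimized with the ordinary unconstrained solver, and the full weight vector is then rebuilt with the held-set weights scaled by the common multiplier g₀.

```python
import numpy as np

def scpu_bls_update(X, w, s, update_idx):
    """One SCPU block least-squares update with L = 1 (hypothetical helper).

    X          : (N, M) input data for the adapt block
    w          : (M,) current weights
    s          : (N,) reference signal
    update_idx : indices of the M1 weights adapted without constraint
    """
    M = w.size
    held_idx = np.setdiff1d(np.arange(M), update_idx)
    X1, X0 = X[:, update_idx], X[:, held_idx]   # columnar demultiplexing
    w0 = w[held_idx]
    y0 = X0 @ w0                                # held-set contribution to the output
    X_tilde = np.column_stack([X1, y0])         # N x (M1 + 1) reduced data matrix
    # Ordinary least squares, now in M1 + 1 dimensions instead of M.
    w_tilde, *_ = np.linalg.lstsq(X_tilde, s, rcond=None)
    g0 = w_tilde[-1]
    w_new = np.empty_like(w)
    w_new[update_idx] = w_tilde[:-1]            # freely adapted weights
    w_new[held_idx] = g0 * w0                   # held set stays in its original subspace
    return w_new, g0

# Assumed demo data.
rng = np.random.default_rng(2)
N, M = 32, 8
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
w = rng.standard_normal(M) + 1j * rng.standard_normal(M)
update_idx = np.array([1, 4, 6])
w_new, g0 = scpu_bls_update(X, w, s, update_idx)
```

Because the previous weights correspond to the feasible point g₀=1 of the reduced problem, the least-squares residual after this update is never larger than before it, while the solve itself involves only M₁+1 unknowns.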

While this invention is susceptible of embodiment in many different forms, there is shown in the drawings and will herein be described in detail several specific embodiments with the understanding that the present disclosure is to be considered as an exemplification of the principles of the invention and is not intended to limit the invention to the embodiments illustrated.

Some of the above-described functions may be composed of instructions, or depend upon and use data, that are stored on storage media (e.g., computer-readable medium). The instructions and/or data may be retrieved and executed by the processor. Some examples of storage media are memory devices, tapes, disks, and the like. The instructions are operational when executed by the processor to direct the processor to operate in accord with the invention; and the data is used when it forms part of any instruction or result therefrom.

The terms “computer-readable storage medium” and “computer-readable storage media” as used herein refer to any medium or media that participate in providing instructions to a CPU for execution. Such media can take many forms, including, but not limited to, non-volatile (also known as ‘static’ or ‘long-term’) media, volatile media and transmission media. Non-volatile media include, for example, one or more optical or magnetic disks, such as a fixed disk, or a hard drive. Volatile media include dynamic memory, such as system RAM or transmission or bus ‘buffers’. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, digital video disk (DVD), any other optical medium, any other physical medium with patterns of marks or holes.

“Memory”, as used herein when referring to computers, is the functional hardware that for the period of use retains a specific structure which can be and is used by the computer to represent the coding, whether data or instruction, which the computer uses to perform its function. Memory thus can be volatile or static, and be any of a RAM, a PROM, an EPROM, an EEPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read data, instructions, or both.

“I/O”, or ‘input/output’, is any means whereby the computer can exchange information with the world external to the computer. This can include a wired, wireless, acoustic, infrared, or other communications link (including specifically voice or data telephony); a keyboard, tablet, camera, video input, audio input, pen, or other sensor; and a display (2D or 3D, plasma, LED, CRT, tactile, or audio). That which allows another device, or a human, to interact with and exchange data with, or control and command, a computer, is an I/O device, without which any computer (or human) is essentially in a solipsistic state.

The above description of the invention is illustrative and not restrictive. Many variations of the invention may become apparent to those of skill in the art upon review of this disclosure. The scope of the invention should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.

While the present invention has been described in connection with at least one preferred embodiment, these descriptions are not intended to limit the scope of the invention to the particular forms (whether elements of any device or architecture, or steps of any method) set forth herein. It will be further understood that the elements, or steps in methods, of the invention are not necessarily limited to the discrete elements or steps, or the precise connectivity of the elements or order of the steps described, particularly where elements or steps which are part of the prior art are not referenced (and are not claimed). To the contrary, the present descriptions are intended to cover such alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims and otherwise appreciated by one of ordinary skill in the art. 

1. A method for digital signal processing (wherein ‘digital’ refers to the nature of the computational processing and ‘signal’ refers to analog electromagnetic waveforms) on devices employing at least one adaptive processor that employs large numbers of adaptation weights for any adaptation criterion of the set of signal estimation, parameter estimation, measured quality, and any combination thereof for “adapt-path” operations used to tune the adaptive processor, not “data-path” operations used by the adaptive processor during and after tuning, said method comprising: for each adaptation event having M-dimensions, performing a dimensionality reduction comprising a linear transformation of the processor parameters being adapted from M-dimensions to (M₁+L)-dimensions in each adaptation event, said linear transformation further comprising: adapting M₁ weights without constraints; and, adapting M₀=M−M₁ weights subjected to L soft constraints forcing them into an L-dimensional subspace spanned by those weights; applying the same dimensionality reduction to input data using the same linear transformation; and, adapting the reduced-dimensionality weights using the same optimization strategy employed by the adaptive processor, except with the input data to which the same dimensionality reduction has been applied.
 2. A method as in claim 1, wherein the step of performing a dimensionality reduction comprising a linear transformation of the processor parameters being adapted from M-dimensions to (M₁+L)-dimensions in each adaptation event further comprises: applying a subspace constraint described in the form M ₀ ^(T)(n)w′∝M ₀ ^(T)(n)w =M ₀ ^(T)(n)wg ₀; jointly adjusting the scalar held-set multiplier g₀ and the update-set weights w₁=M₁(n)w to optimize an unconstrained criterion metric F(w;n) over adapt block n, said metric determined by $\left. w\leftarrow{\arg \underset{w^{\prime}}{opt}{F\left( {w^{\prime};n} \right)}} \right.$ subject to additional linear constraint (w′)_(m∈M) ₀ _((n))=(w)_(m∈M) ₀ _((n)); and, using the resulting full output weight vector in the adaptation event.
 3. A method as in claim 2 wherein the full output weight vector is efficiently computed using vector-scalar multiplies and multiply-free multiplexing (MUX) operations.
 4. A method as in claim 2, wherein the step of jointly adjusting the scalar held-set multiplier g₀ and the update-set weights w₁=M₁(n)w to optimize an unconstrained criterion metric F(w;n) over adapt block n further comprises using a (M₁+1)×1 enhanced weight vector {tilde over (w)}= $\quad\begin{pmatrix} w_{1} \\ g_{0} \end{pmatrix}$ using optimization formula $\left. \overset{\sim}{w}\leftarrow{\arg \underset{{\overset{\sim}{w}}^{\prime}}{opt}{F\left( {{\overset{\sim}{w}}^{\prime};n} \right)}} \right.$ over each data block to produce the full output weight vector w=M₁(n)w₁+M₀(n)w₀g₀.
 5. A method as in claim 4 wherein the full output weight vector is efficiently computed using vector-scalar multiplies and multiply-free multiplexing (MUX) operations.
 6. A method for implementing partial-update methods (PUMs) in any adaptive processor that adjusts weights to optimize an adaptation criterion using any of a signal estimation and a parameter estimation algorithm in digital signal processing (wherein ‘digital’ refers to the nature of the computational processing and ‘signal’ refers to analog electromagnetic waveforms) when said adaptive processor employs large numbers of adaptation weights for any adaptation criterion of the set of signal estimation, parameter estimation, measured quality, and any combination thereof for “adapt-path” operations used to tune the adaptive processor, not “data-path” operations used by the adaptive processor during and after tuning, said method comprising: for each adaptation event comprising a partial-update effected by the adaptive processor for an adapt path operation having M-dimensions, performing a dimensionality reduction comprising a linear transformation of the processor parameters being adapted from M-dimensions to (M₁+L)-dimensions in said partial update, said linear transformation further comprising: adapting M₁ weights without constraints; and, adapting M₀=M−M₁ weights subjected to L soft constraints forcing them into an L-dimensional subspace spanned by those weights; applying the same dimensionality reduction to input data using the same linear transformation; and, adapting the reduced-dimensionality weights using the same optimization strategy employed by the adaptive processor, except with the input data to which the same dimensionality reduction has been applied.
 7. A method as in claim 6, wherein the step of performing a dimensionality reduction comprising a linear transformation of the processor parameters being adapted from M-dimensions to (M₁+L)-dimensions in each adaptation event further comprises: replacing a hard linear constraint describable in the form M₀(n)w′=M₀(n)w with a softer subspace constraint described in the form M ₀ ^(T)(n)w′∝M ₀ ^(T)(n)w =M ₀ ^(T)(n)wg ₀; jointly adjusting the scalar held-set multiplier g₀ and the update-set weights w₁=M₁(n)w to optimize an unconstrained criterion metric F(w;n) over adapt block n, said metric determined by $\left. w\leftarrow{\arg \underset{w^{\prime}}{opt}{F\left( {w^{\prime};n} \right)}} \right.$ subject to additional linear constraint (w′)_(m∈M) ₀ _((n))=(w)_(m∈M) ₀ _((n)); and, using the resulting full output weight vector in the adaptation event.
 8. A method as in claim 6 for partial-update affine projections, further comprising: separating w into update-set and held-set components w₁ and w₀ using multiply-free demultiplexing (DMX) operations; for SCPU-AP/NLMS SCPU-BLS algorithms with μ<1, construct a (M₁+1)×1 dimensional weight matrix ${\overset{\sim}{w} = \begin{pmatrix} w_{1} \\ 1 \end{pmatrix}};$ separating X(n) into update-set and held-set components X₁(n) and X₀(n) using multiply-free columnar DMX operations; computing y₀(n)=X₀(n)w₀; specifically for any SCPU-AP/NLMS algorithm, further computing y₁(n)=X₁(n)w₁ and y(n)=y₁(n)+y₀(n); constructing a N×(M₁+1) dimensional SCPU data matrix {tilde over (X)}(n)=[X ₁(n)y ₀(n)]; optimizing {tilde over (w)} using the original unconstrained algorithm, with dimensionality reduced from M to M₁+1; updating w₁ and w₀ using w₁←[({tilde over (w)})_(m)]_(m=1) ^(M) ¹ , w₀←({tilde over (w)})_(M) ₁ ₊₁w₀; and, reconstructing the linear combiner weights using w=M₁(n)w₁+M₀(n)w₀.
 9. A method as in claim 6 wherein the adaptation arises from and must apply over a multiport digital signal processing hardware any of an affine-projection and block least-squares adaptation algorithm.
 10. A method as in claim 9 wherein the multiport digital signal processing is uncoupled.
 11. A method as in claim 9 wherein the multiport digital signal processing is fully-coupled.
 12. A method for digital signal processing (wherein ‘digital’ refers to the nature of the computational processing and ‘signal’ refers to analog electromagnetic waveforms) on devices employing at least one adaptive processor that employs large numbers of adaptation weights for any adaptation criterion of the set of signal estimation, parameter estimation, measured quality, and any combination thereof for “adapt-path” operations used to tune the adaptive processor, not “data-path” operations used by the adaptive processor during and after tuning, said method comprising: for each adaptation event having M-dimensions, performing a dimensionality reduction comprising a linear transformation of the processor parameters being adapted from M-dimensions to (M₁+L)-dimensions in each adaptation event, said linear transformation further comprising: adapting M₁ weights without constraints; and, adapting M₀=M−M₁ weights subjected to L soft constraints forcing them into an L-dimensional subspace spanned by those weights; applying the same dimensionality reduction to input data using the same linear transformation; and, adapting the reduced-dimensionality weights using substantively the same optimization strategy employed by the adaptive processor for the input data to which the same dimensionality reduction has been applied.
 13. A method as in claim 1 for partially blind methods in which the reference vector s(n) is partially known at the receive processor over adapt block n, as the reference vector has any of an unknown carrier and timing offset relative to the sequence contained in the input data sequence.
 14. A method as in claim 13 for any of timing and carrier tracking methods, wherein s(n) has an unknown offset between the input data and an original transmitted signal containing the reference signal, further comprising: replacing the nonblind weight adaptation algorithm {tilde over (w)}←(1−μ){tilde over (w)}+μ{tilde over (X)} ^(†)(n)s(n), 0<μ≦1 with {tilde over (w)}←(1−μ){tilde over (w)}+μ{tilde over (X)} ^(†)(n)(s({circumflex over (n)} _(off) ;n)∘δ(ω_(off))), s(n _(off) ;n)=[s(nN+n _(sym) +n _(off))]_(n) _(sym) ₌₁ ^(N), δ(ω_(off))=[e ^(jω) ^(off) ^(n) ^(sym) ]_(n) _(sym) ₌₁ ^(N), where {s(n_(sym))} is a component of the transmitted signal that is known over the adapt block except for the unknown offset; and, optimizing the unknown offset over each adapt block by setting $\begin{matrix} {\left\{ {\hat{\omega}}_{off}(n),{\hat{n}}_{off}(n) \right\} = \arg\max\limits_{\omega_{off},n_{off}}\eta\left( \omega_{off},n_{off};n \right),} \\ {\eta\left( \omega_{off},n_{off};n \right) = \frac{\left\| {\overset{\sim}{Q}}^{H}(n)\left( s\left( n_{off};n \right) \circ \delta\left( \omega_{off} \right) \right) \right\|_{2}^{2}}{\left\| s\left( n_{off};n \right) \right\|_{2}^{2}}} \\ {= \frac{\left\| \sum\limits_{n_{sym} = 1}^{N}{\overset{\sim}{q}}^{*}\left( n_{sym} \right)s\left( nN + n_{sym} + n_{off} \right)e^{j\omega_{off}n_{sym}} \right\|_{2}^{2}}{\sum\limits_{n_{sym} = 1}^{N}\left| s\left( nN + n_{sym} + n_{off} \right) \right|^{2}},} \end{matrix}$ where {tilde over (Q)}=[{tilde over (q)}(1) . . . {tilde over (q)}(N)]^(T) is the Q-component of the QRD of {tilde over (X)}(n), using fast Fourier transform (FFT) methods if the frequency offset ω is completely unknown (acquisition phases), and using Gauss-Newton or Newton methods if the frequency offset ω is known closely (tracking phases).
 15. A method as in claim 1 for fully blind methods in which the reference vector is unknown but has some known, exploitable structure.
 16. A method as in claim 1 for property-mapping methods in which s(n) is a member of a known property set, said method further comprising: replacing the non-blind weight adaptation algorithm $\left. \overset{\sim}{w}\leftarrow{{\left( {1 - \mu} \right)\overset{\sim}{w}} + {\mu \; {{\overset{\sim}{X}}^{\dagger}(n)}{s(n)}}} \right.,\mspace{14mu} {0 < \mu \leq 1}$ with ${{y(n)} = {{\overset{\sim}{X}(n)}\overset{\sim}{w}}},{{\hat{s}(n)} = {\arg {\min\limits_{s \in {\mathcal{S}(n)}}\left\| {s - {y(n)}} \right\|}}},{and}$ $\left. \overset{\sim}{w}\leftarrow{{\left( {1 - \mu} \right)\overset{\sim}{w}} + {\mu \; {{\overset{\sim}{X}}^{\dagger}(n)}{\hat{s}(n)}}} \right.;$ where 𝒮(n) is a desired signal set, potentially variable as a function of adapt block n, that s(n) is known to belong to.
 17. A method as in claim 1 for dominant-mode prediction (DMP) methods, in which s(n) is known to be substantively present in a linear subspace with any of a known or estimable structure, said method further comprising: using an enhanced weight algorithm effecting ${\overset{\sim}{w} = {\arg {\max\limits_{w \in {\mathbb{C}}^{M_{1} + 1}}{\overset{\sim}{\gamma}\left( {w;n} \right)}}}},{{\overset{\sim}{\gamma}\left( {w;n} \right)} = \frac{{w^{H}\left( {{{\overset{\sim}{X}}_{s}^{H}(n)}{{\overset{\sim}{X}}_{s}(n)}} \right)}w}{{w^{H}\left( {{{\overset{\sim}{X}}_{\bot}^{H}(n)}{{\overset{\sim}{X}}_{\bot}(n)}} \right)}w}},\left\{ \begin{matrix} {{{\overset{\sim}{X}}_{s}(n)} = {{U_{s}^{H}(n)}{\overset{\sim}{X}(n)}}} \\ {{{\overset{\sim}{X}}_{\bot}(n)} = {{U_{\bot}^{H}(n)}{\overset{\sim}{X}(n)}}} \end{matrix} \right.$ wherein the enhanced combiner weights {tilde over (w)}_(max) and the maximal value {tilde over (γ)}_(max) are equal to the dominant solution {{tilde over (γ)}₁,{tilde over (w)}₁} of the DMP eigenequation {tilde over (γ)}_(m)({tilde over (X)} _(⊥) ^(H)(n){tilde over (X)} _(⊥)(n)){tilde over (w)} _(m)=({tilde over (X)} _(s) ^(H)(n){tilde over (X)} _(s)(n)){tilde over (w)} _(m), {tilde over (γ)}_(m)≧{tilde over (γ)}_(m+1) and the dominant eigenvalue {tilde over (γ)}₁ also provides an estimate of the SINR of the combiner output signal, such that the dominant eigenvalue {tilde over (γ)}₁ also is usable both to detect the target signal, and to search over postulated subspaces to find the subspace that most closely contains or rejects s(n).
 18. A method as in claim 1 for conjugate self-coherence restoral (C-SCORE) methods in which s(n) is known to have substantive conjugate self-coherence at some known or estimable frequency offset ω, such that $\frac{\left| \sum\limits_{n_{sym} = 1}^{N}s^{2}\left( nN + n_{sym} \right)e^{- j\omega n_{sym}} \right|}{\sum\limits_{n_{sym} = 1}^{N}\left| s\left( nN + n_{sym} \right) \right|^{2}} \approx 1$ said method further comprising: using for the enhanced weight algorithm for a postulated twice-carrier offset ω, $\begin{matrix} {\overset{\sim}{w} = \arg\max\limits_{w \in {\mathbb{C}}^{M_{1} + 1}}\overset{\sim}{\rho}\left( \left. w \middle| \omega \right.;n \right),} \\ {\overset{\sim}{\rho}\left( \left. w \middle| \omega \right.;n \right) = \frac{\left| w^{H}\left( {\overset{\sim}{X}}^{H}(n)\Delta(\omega){\overset{\sim}{X}}^{*}(n) \right)w^{*} \right|}{w^{H}\left( {\overset{\sim}{X}}^{H}(n)\overset{\sim}{X}(n) \right)w},\quad\Delta(\omega) = {diag}\left\{ e^{j\omega n_{sym}} \right\}_{n_{sym} = 1}^{N}} \end{matrix}$ wherein the enhanced combiner weights {tilde over (w)}_(max) and the maximal value {tilde over (ρ)}_(max) are equal to the dominant solution {{tilde over (ρ)}₁(ω),{tilde over (w)}₁(ω)} of the C-SCORE pseudo-eigenequation, {tilde over (ρ)}_(m)(ω)({tilde over (X)} ^(H)(n){tilde over (X)}(n)){tilde over (w)} _(m)(ω)=({tilde over (X)} ^(H)(n)Δ(ω){tilde over (X)}*(n)){tilde over (w)} _(m)*(ω), {tilde over (ρ)}_(m)≧{tilde over (ρ)}_(m+1).